I-Mutant: a tool for predicting
protein stability upon mutation
Introduction
I-Mutant is a neural-network-based web server
for the automatic prediction of protein stability changes upon single-site
mutations. The tool was trained on a data set derived from ProTherm [1], presently the most comprehensive database
of protein mutations. When trained and tested with a cross-validation procedure,
I-Mutant correctly predicts whether the protein mutation stabilises or
destabilises the protein structure in 80% of the cases (on the S1615
set consisting of 1615 mutations). The web server also provides the values of free
energy change predictions computed with the energy-based FOLD-X tool. By
coupling the FOLD-X predictions with those of I-Mutant, and considering
the reliability index value of the latter, the joint method achieves an accuracy
of 93% on one third of the database, making I-Mutant a valuable tool for
protein design and mutation.
Results
In the table we report some parameters that score
the efficiency of our method:

            Q2     P(+)   Q(+)   P(-)   Q(-)   C
I-Mutant    0.81   0.71   0.52   0.83   0.91   0.49

The overall accuracy Q2 is:
Q2=p/N
where p is the total number of correctly
predicted residues and N is the total number of residues.
The correlation coefficient C is defined as:
C(s)=[p(s)n(s)-u(s)o(s)] / D
where D is the normalization factor
D=[(p(s)+u(s))(p(s)+o(s))(n(s)+u(s))(n(s)+o(s))]^(1/2)
for each class s (+ and -, for increasing
and decreasing stability, respectively); p(s) and n(s) are the total number
of correct predictions and correctly rejected assignments, respectively,
and u(s) and o(s) are the numbers of under and over predictions.
The coverage for each discriminated class s is evaluated as:
Q(s)=p(s)/[p(s)+u(s)]
where p(s) and u(s) are the same as in the previous equations. The probability
of correct predictions P(s) (or accuracy for s) is computed as:
P(s)=p(s)/[p(s)+o(s)]
where p(s) and o(s) are as previously defined; P(s) ranges from 1
to 0.
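
The scores defined above can be computed from four counts per class. The following is a minimal Python sketch, not the code used by the server; the function names and the counts in the example are hypothetical and chosen only for illustration.

```python
import math


def class_scores(p, n, u, o):
    """Return (Q, P, C) for one class s.

    p: correct predictions, n: correctly rejected assignments,
    u: under-predictions, o: over-predictions.
    """
    coverage = p / (p + u) if (p + u) else 0.0      # Q(s)
    probability = p / (p + o) if (p + o) else 0.0   # P(s)
    d = math.sqrt((p + u) * (p + o) * (n + u) * (n + o))
    correlation = (p * n - u * o) / d if d else 0.0  # C(s)
    return coverage, probability, correlation


def overall_accuracy(correct, total):
    """Q2: fraction of correct predictions over all predictions."""
    return correct / total


# Hypothetical counts for the "+" class (illustration only, not real data).
p_plus, n_plus, u_plus, o_plus = 50, 380, 50, 20
q_plus, prob_plus, c_plus = class_scores(p_plus, n_plus, u_plus, o_plus)
q2 = overall_accuracy(p_plus + n_plus, p_plus + n_plus + u_plus + o_plus)

# With two classes, the counts for "-" mirror those for "+":
# p(-) = n(+), n(-) = p(+), u(-) = o(+), o(-) = u(+).
q_minus, prob_minus, c_minus = class_scores(n_plus, p_plus, o_plus, u_plus)
```

With only two classes the correlation coefficient is the same for both, which is consistent with the table reporting a single value of C.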
Required Inputs
PDB code: PDB protein code [2]
Chain: Chain label. Default value: "_"
Position: PDB residue position
Temperature: Temperature in degrees Celsius [0-100]
pH: negative logarithm of H+ concentration [0-14]
FOLD-X: Post your query to the FOLD-X Server
e-mail: Insert your e-mail. The output of our program will be sent to your address
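
As an illustration of the ranges listed above, a query could be checked before submission along these lines. This is only a sketch: the class name, field names and error messages are hypothetical and do not describe the server's actual interface, and the PDB code in the usage example is just a sample value.

```python
from dataclasses import dataclass


@dataclass
class MutantQuery:
    """Hypothetical container for the form fields listed above."""
    pdb_code: str
    position: str          # PDB residue position, kept as text
    temperature: float     # degrees Celsius
    ph: float
    email: str
    chain: str = "_"       # default chain label from the form
    use_foldx: bool = False

    def validate(self):
        """Return a list of problems; an empty list means the query looks fine."""
        errors = []
        if not self.pdb_code.strip():
            errors.append("PDB code is missing")
        if not self.position.strip():
            errors.append("position is missing")
        if not 0 <= self.temperature <= 100:
            errors.append("temperature must be within 0-100 Celsius")
        if not 0 <= self.ph <= 14:
            errors.append("pH must be within 0-14")
        if "@" not in self.email:
            errors.append("e-mail address looks invalid")
        return errors


# Usage example with sample values.
query = MutantQuery(pdb_code="1BNI", position="35", temperature=25.0,
                    ph=7.0, email="user@example.org")
problems = query.validate()
```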
Outputs
The output consists of a table listing
the sign of the predicted stability changes upon the 19 possible mutations
for a given PDB position.
The RSA value (Relative Solvent
Accessible Area) is calculated using the DSSP program [3]. It is obtained
by dividing the surface area computed by DSSP by the relative amino acid
surface [4].
The RI value (Reliability Index) is calculated from the output O
of the neural network:
RI=20*abs(O-0.5)
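
A short Python sketch of this formula follows. It assumes that the network output O lies between 0 and 1, which the text does not state explicitly; under that assumption RI runs from 0 (output at 0.5, least reliable) to 10 (output at 0 or 1, most reliable). The function name is hypothetical.

```python
def reliability_index(o):
    """RI = 20 * |O - 0.5| for a network output O (assumed to be in [0, 1])."""
    if not 0.0 <= o <= 1.0:
        raise ValueError("network output expected in [0, 1] (assumption)")
    return 20 * abs(o - 0.5)


ri_uncertain = reliability_index(0.5)   # output exactly on the decision boundary
ri_confident = reliability_index(1.0)   # output saturated at one extreme
```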
If the FOLD-X
option is selected, our program posts a query to the FOLD-X Server [5] and sends via e-mail
the values of DG (free energy variation) and DDG (change in free energy variation upon
mutation) expressed in kcal/mol.
If FOLD-X does not answer because of
server trouble of any kind, we return "data NA" instead of the DDG and DG values.
Possible errors may occur when PDB files
contain broken chains or a different numbering of residues than expected
by the user.
[1] Gromiha MM, An J, Kono H, Oobatake M, Uedaira H, Prabakaran P,
Sarai A (2000). ProTherm, version 2.0: thermodynamic database for proteins
and mutants. Nucleic Acids Res. 28, 283-285.
[2] Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig
H, Shindyalov IN, Bourne PE (2000). The Protein Data Bank. Nucleic Acids
Res. 28, 235-242.
[3] Kabsch W, Sander C (1983). Dictionary of protein secondary structure:
pattern recognition of hydrogen-bonded and geometrical features. Biopolymers
22, 2577-2637.
[4] Chothia C (1976). The Nature of the Accessible and Buried Surfaces
in Proteins. J. Mol. Biol. 105, 1-14.
[5] Guerois R, Nielsen JE, Serrano L (2002). Predicting changes in
the stability of proteins and protein complexes: A study of more than
1000 mutations. J. Mol. Biol. 320, 369-387.