Disordered regions (DRs) are entire proteins or regions of proteins which lack a fixed tertiary structure, essentially being partially or fully unfolded. Such disordered regions have been shown to be involved in a variety of functions, including DNA recognition, modulation of specificity/affinity of protein binding, molecular threading, activation by cleavage, and control of protein lifetimes. Although these DRs lack a defined 3-D structure in their native states, they frequently undergo disorder-to-order transitions upon binding to their partners.
As it is known that sequence determines structure, we assumed that sequence would determine lack of structure as well. To test this, we developed a series of neural network predictors (NNPs) that use amino acid sequence data to predict disorder in a given region. This collection of Predictors of Natural Disordered Regions is termed PONDR®.
PONDR® functions from primary sequence data alone. The predictors are feedforward neural networks that use sequence information from windows of generally 21 amino acids. Attributes, such as the fractional composition of particular amino acids or hydropathy, are calculated over this window, and these values are used as inputs for the predictor. The neural network, which has been trained on a specific set of ordered and disordered sequences, then outputs a value for the central amino acid in the window. The predictions are then smoothed over a sliding window of 9 amino acids. If a residue value exceeds a threshold of 0.5 (the threshold used for training) the residue is considered disordered.
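The windowing, smoothing, and thresholding steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the trained neural network is not reproduced here, so a single fractional-composition attribute stands in for the network output, and the function names and the residue set `DEKS` are hypothetical choices made for the example.

```python
WINDOW = 21      # input window width stated in the text
SMOOTH = 9       # smoothing window width stated in the text
THRESHOLD = 0.5  # disorder threshold stated in the text


def window_fraction(seq, residues, width=WINDOW):
    """Fraction of `residues` in a window centred on each position.

    Positions too close to a terminus for a full window are left as None.
    """
    half = width // 2
    out = [None] * len(seq)
    for i in range(half, len(seq) - half):
        window = seq[i - half:i + half + 1]
        out[i] = sum(aa in residues for aa in window) / width
    return out


def smooth(values, width=SMOOTH):
    """Moving average; None wherever the window contains a missing value."""
    half = width // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        if None not in window:
            out[i] = sum(window) / width
    return out


def call_disorder(scores, threshold=THRESHOLD):
    """True where a smoothed score exists and exceeds the threshold."""
    return [s is not None and s > threshold for s in scores]


def predict(seq, residues=frozenset("DEKS")):
    # ASSUMPTION: the composition value is used directly as the score.
    # In PONDR the attributes are inputs to a trained neural network instead.
    smoothed = smooth(window_fraction(seq, residues))
    return smoothed, call_disorder(smoothed)
```

With a 21-residue input window and a 9-residue smoothing window, this sketch gives its first smoothed value at residue 15 and its last at 15 residues from the C-terminus, because 10 positions are lost to the input window and 4 more to smoothing at each end.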
The default PONDR® predictor is VL-XT; the XL1 and CaN predictors are also available.
The VL-XT predictor integrates three feedforward neural networks: the VL1 predictor from Romero et al. 2000 and the N- and C-terminal predictors (XT) from Li et al. 1999. VL1 was trained using 8 disordered regions identified from missing electron density in X-ray crystallographic studies, and 7 disordered regions characterized by NMR. The XT predictors were also trained using X-ray crystallographic data. The attributes used by these predictors are listed in Table 1 (taken from Romero et al. 2000). Output for the VL1 predictor starts and ends 11 amino acids from the termini. The XT predictors output predictions up to 14 amino acids from their respective ends. A simple average is taken for the overlapping predictions, and a sliding window of 9 amino acids is used to smooth the prediction values along the length of the sequence. Unsmoothed prediction values from the XT predictors are used for the first and last 4 sequence positions.
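The merging of the VL1 and XT outputs can be illustrated with a short Python sketch. It assumes that each predictor returns a per-residue list with `None` where it gives no prediction, and that "up to 14 amino acids from their respective ends" means the first 14 and last 14 residues. The function names are hypothetical and the network scores themselves are taken as given.

```python
def merge_vlxt(vl1, xn, xc):
    """Average whichever predictions are available at each position.

    vl1, xn, xc: equal-length lists of scores, None where a predictor
    gives no output (ASSUMED representation, not the PONDR format).
    """
    merged = []
    for values in zip(vl1, xn, xc):
        present = [v for v in values if v is not None]
        merged.append(sum(present) / len(present) if present else None)
    return merged


def smooth_keep_ends(values, width=9):
    """Moving average that leaves the first and last width // 2 values unsmoothed."""
    half = width // 2
    out = list(values)  # terminal positions keep their raw values
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        if None not in window:
            out[i] = sum(window) / width
    return out
```

For a 9-residue smoothing window, `width // 2` is 4, which matches the statement that the first and last 4 positions carry unsmoothed XT values.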
The XL1 predictor is a feedforward neural network optimized to predict regions of disorder greater than 39 amino acids (Romero et al., 1997). It was trained on 7 of the 8 disordered regions identified from missing electron density that were used to train the VL1 predictor. The attributes used by this predictor are listed in Table 1 (taken from Romero et al., 2000). This predictor uses a sliding window of 9 amino acids to smooth the prediction values along the length of the sequence, so predictions are only provided starting and ending 15 amino acids from the termini.
The CaN predictor is a feedforward neural network that was trained on regions of 13 Calcineurin proteins that were identified by sequence homology with the known disordered region of human Calcineurin (Romero et al., 1997). The attributes used by this predictor are listed in Table 1. This predictor shows poor out-of-sample accuracy, but in some cases the contrast between its output and that of the other predictors provides insight into binding regions of disordered sequences (Garner et al., 1999).
Table 1. Attributes used by various PONDR®s

| PONDR® | Attributes | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| XL1 | Flexibility | Hydropathy | C | W | Y | H | D | E | K | S |
| VL1 | Coordination number | Net charge | WFY | W | Y | F | D | E | K | R |
| XN | Coordination number | V | VIYFW | M | N | H | D | PEVK | | |
| XC | Coordination number | Hydropathy | VIYFW | M | T | H | PEVK | R | | |
| CaN | beta-moment | V | F | W | Y | H | C | E | S | R |
The graph shows the residue-by-residue output of the neural network. Any region that exceeds 0.5 on the Y-axis is considered disordered. Note that as the length of the predicted disordered region increases, the accuracy of the prediction also increases. Extremely long predictions of disorder have a very high level of confidence. In some cases, extreme minima within regions structurally characterized as disordered correlate with binding regions (see Garner et al., 1999).
The log file shows the sequence and, directly beneath it, a capital "D" wherever the output of the predictor exceeded the threshold. At the top of the log output are the residue locations of contiguous predictions of disorder, their lengths, and their average scores.
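The information reported in the log can be derived from the per-residue scores as in the Python sketch below. The exact layout of the PONDR® log file is not reproduced; the function names and the tuple layout are assumptions made for illustration, and residue numbering is taken to be 1-based.

```python
def disorder_mask(scores, threshold=0.5):
    """One character per residue: 'D' above the threshold, a space otherwise."""
    return "".join("D" if s > threshold else " " for s in scores)


def disordered_regions(scores, threshold=0.5):
    """Return (start, end, length, average score) for each contiguous run.

    start and end are 1-based and inclusive (ASSUMED numbering convention).
    """
    regions = []
    start = None
    # A sentinel below any threshold closes a run that reaches the last residue.
    for i, s in enumerate(list(scores) + [float("-inf")]):
        if s > threshold:
            if start is None:
                start = i
        elif start is not None:
            segment = scores[start:i]
            regions.append((start + 1, i, len(segment), sum(segment) / len(segment)))
            start = None
    return regions
```

The mask string can be printed under the sequence to mimic the "D" line, and the region tuples correspond to the summary given at the top of the log.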
Table 2. PONDR® accuracies

| Predictor | False Negative (dis_ALL) | False Positive (O_PDB_S25) | 5-cross Validation |
|---|---|---|---|
| VL-XT | 40% | 22% | 75 - 83% |
| XL1 | 62% | 19% | 73 ± 4% |
| CaN | 39% | 34% | 83 ± 5% |
Sequence complexity of disordered protein.
Romero, P., Z. Obradovic, X. Li, E. Garner, C. Brown, and A. K. Dunker, Proteins: Struct. Funct. Gen., 2001, 42:38-48.
Sequence data analysis for long disordered regions prediction in the calcineurin family.
Romero, P., Z. Obradovic, and A. K. Dunker, Genome Informatics, 1997, 8:110-124.
For more reading on the development, use, and applications of PONDR®, please refer to our bibliography page.
A comprehensive review of disordered proteins was written by Peter Wright and Jane Dyson; follow this link for the abstract.
To contact us:
Molecular Kinetics, Inc.
351 West 10th Street; Suite 318
Indianapolis, Indiana 46202
Tel: 317-638-0244
Fax: 317-638-0295
e-mail: main@molecularkinetics.com
PONDR® was developed by P. Romero, A.K. Dunker, X. Li, and Z. Obradovic.
CGI, web interface, and miscellaneous coding was done by E. Garner, C. Crosetto and J. Mueller.
Access to PONDR® is provided by Molecular Kinetics (6201 La Pas Trail - Ste 160, Indianapolis, IN 46268;
www.molecularkinetics.com; main@molecularkinetics.com) under license from the WSU Research Foundation.
PONDR® is copyright ©1999 by the WSU Research Foundation, all rights reserved.
Molecular Kinetics, Inc., Washington State University and the WSU Research Foundation and their several employees and consultants assume no liability, either real or implied, from the use of PONDR® in any of its forms or the results of its predictions for any damage, loss of time, loss of profit, either real or potential, or any other damage or loss that may arise from the use of PONDR® in any of its forms or from the results of its predictions.
Copyright ©2001, 2002 Molecular Kinetics, Inc. All rights reserved.
Last updated: Apr 26, 2004