Disordered regions (DRs) are entire proteins or regions of proteins that lack a fixed tertiary structure and are therefore partially or fully unfolded. Such regions have been shown to be involved in a variety of functions, including DNA recognition, modulation of the specificity and affinity of protein binding, molecular threading, activation by cleavage, and control of protein lifetimes. Although DRs lack a defined 3-D structure in their native states, they frequently undergo disorder-to-order transitions upon binding to their partners.
Because amino acid sequence is known to determine structure, we hypothesized that sequence would determine lack of structure as well. To test this, we developed a series of neural network predictors (NNPs) that use amino acid sequence data to predict disorder in a given region. This collection of Predictors of Natural Disordered Regions is termed PONDR®.
PONDR® operates on primary sequence data alone. The predictors are feedforward neural networks that use sequence information from windows of, in general, 21 amino acids. Attributes such as the fractional composition of particular amino acids or hydropathy are calculated over this window, and these values are used as inputs to the predictor. The neural network, which has been trained on a specific set of ordered and disordered sequences, then outputs a value for the central amino acid in the window. The predictions are then smoothed over a sliding window of 9 amino acids. If a residue's value exceeds a threshold of 0.5 (the threshold used for training), the residue is considered disordered.
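The windowing, smoothing, and thresholding steps described above can be illustrated with a minimal Python sketch. It is not the PONDR® implementation: the trained networks are not reproduced here, so `toy_network` is a hypothetical stand-in that simply maps low hydropathy and high charge to a high score. The choice of the Kyte-Doolittle hydropathy scale, the two attributes used, the truncation of windows at the sequence termini, and all function names are assumptions made for illustration only.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Kyte-Doolittle hydropathy scale. Assumption: the text does not say which
# hydropathy scale the predictors use; this one is chosen for illustration.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
CHARGED = set("DEKR")


def centered_slices(n: int, width: int) -> List[Tuple[int, int]]:
    """Half-open (start, stop) bounds of a window centered on each position.

    Windows are truncated at the ends of the sequence (an assumption).
    """
    half = width // 2
    return [(max(0, i - half), min(n, i + half + 1)) for i in range(n)]


def window_attributes(window: str) -> List[float]:
    """Two example attributes: mean hydropathy and fraction of charged residues."""
    n = len(window)
    mean_hydropathy = sum(KD_HYDROPATHY.get(aa, 0.0) for aa in window) / n
    frac_charged = sum(aa in CHARGED for aa in window) / n
    return [mean_hydropathy, frac_charged]


def toy_network(attrs: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained network; returns a value in (0, 1)."""
    z = -attrs[0] + attrs[1]
    return 1.0 / (1.0 + math.exp(-z))


def raw_scores(seq: str,
               score_fn: Callable[[Sequence[float]], float] = toy_network,
               width: int = 21) -> List[float]:
    """Score the central residue of each 21-residue window."""
    return [score_fn(window_attributes(seq[a:b]))
            for a, b in centered_slices(len(seq), width)]


def smooth(values: Sequence[float], width: int = 9) -> List[float]:
    """Moving average over a sliding window of 9 positions."""
    return [sum(values[a:b]) / (b - a)
            for a, b in centered_slices(len(values), width)]


def predict_disorder(seq: str, threshold: float = 0.5) -> List[bool]:
    """True where the smoothed score exceeds the threshold."""
    return [v > threshold for v in smooth(raw_scores(seq.upper()))]


if __name__ == "__main__":
    # Arbitrary example sequence: a hydrophobic stretch followed by a charged one.
    example = "ILVILVILVILVILVILVILV" + "EKEKEKEKEKEKEKEKEKEKE"
    calls = predict_disorder(example)
    print("".join("D" if c else "O" for c in calls))
```

Passing the scoring function as a parameter keeps the windowing and smoothing logic separate from the model, so a real trained predictor could be substituted for `toy_network` without changing the rest of the sketch.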