Are knowledge-based potentials derived from protein structure sets discriminative with respect to amino acid types?
The parametric description of residue environments through solvent accessibility, backbone conformation, or pairwise residue-residue distances is the key to the comparison between amino acid types at protein sequence positions and residue locations in structural templates (condition of protein sequence-structure match). For the first time, the research results presented in this study clarify and allow to quantify, on a rigorous statistical basis, to what extent the amino acid type-specific distributions of commonly used environment parameters are discriminative with respect to the 20 amino acid types. Relying on the Bahadur theory, we estimate the probability of error in a single-sequence-structure alignment based on weak or absent discriminative power in a learning database of protein structure. We present the results for many residue environment variables and demonstrate that each fold description parameter is sensitive with respect to only a few amino acid types while indifferent to