
Selection originating from protein stability/foldability: Relationships between protein folding free energy, sequence ensemble, and fitness

2016-12-30

Sanzo Miyazawa


Abstract

Assuming that mutation and fixation processes are reversible Markov processes, we prove that the equilibrium ensemble of sequences obeys a Boltzmann distribution with $\exp(4N_e m(1 - 1/(2N)))$, where $m$ is Malthusian fitness and $N_e$ and $N$ are effective and actual population sizes. On the other hand, the probability distribution of sequences with maximum entropy that satisfies a given amino acid composition at each site and a given pairwise amino acid frequency at each site pair is a Boltzmann distribution with $\exp(-\psi_N)$, where $\psi_N$ is represented as the sum of one-body and pairwise potentials. A protein folding theory indicates that homologous sequences obey a canonical ensemble characterized by $\exp(-\Delta G_{ND}/k_B T_s)$, or by $\exp(-G_N/k_B T_s)$ if the amino acid composition is kept constant, where $\Delta G_{ND} \equiv G_N - G_D$, $G_N$ and $G_D$ are the native and denatured free energies, and $T_s$ is the selective temperature. Thus, $4N_e m(1 - 1/(2N))$, $-\Delta\psi_{ND}$, and $-\Delta G_{ND}/k_B T_s$ must be equivalent to each other. Based on the analysis of the changes ($\Delta\psi_N$) of $\psi_N$ due to single nucleotide nonsynonymous substitutions, $T_s$, and then the glass transition temperature $T_g$ and $\Delta G_{ND}$, are estimated with reasonable values for 14 protein domains. In addition, approximating the probability density function (PDF) of $\Delta\psi_N$ by a log-normal distribution, PDFs of $\Delta\psi_N$ and of $K_a/K_s$, the ratio of nonsynonymous to synonymous substitution rates per site, are estimated for all mutants and for fixed mutants. It is confirmed that $T_s$ negatively correlates with the mean of $K_a/K_s$. Stabilizing mutations are significantly fixed by positive selection, and balance with destabilizing mutations fixed by random drift. Supporting the nearly neutral theory, neutral selection is not significant.
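As a rough numerical illustration of where the exponent $4N_e m(1 - 1/(2N))$ can come from, the sketch below checks a detailed-balance identity. It assumes Kimura's diffusion approximation for the fixation probability of a single new mutant in a diploid population, which the abstract does not state explicitly. The function names, the fitness difference `s` (standing in for a difference in $m$ between two sequences), and the population sizes are all hypothetical choices made for this example. Under that assumption, the log ratio of forward to backward fixation probabilities equals $4N_e s(1 - 1/(2N))$, which is the kind of ratio a Boltzmann-distributed equilibrium requires.

```python
import math


def fixation_probability(s, n_e, n):
    """Kimura's diffusion approximation (an assumption of this sketch).

    s   : fitness difference of the mutant relative to the resident
    n_e : effective population size
    n   : actual population size (diploid, so 2n gene copies)
    """
    q = 1.0 / (2.0 * n)          # initial frequency of a single new mutant
    a = 4.0 * n_e * s
    if abs(a) < 1e-12:
        return q                 # neutral limit
    # (1 - exp(-a*q)) / (1 - exp(-a)), written with expm1 for accuracy
    return math.expm1(-a * q) / math.expm1(-a)


def boltzmann_log_ratio(s, n_e, n):
    """Exponent difference 4 N_e s (1 - 1/(2N)) from the abstract's Boltzmann factor."""
    return 4.0 * n_e * s * (1.0 - 1.0 / (2.0 * n))


if __name__ == "__main__":
    n_e, n = 500.0, 1000.0       # hypothetical population sizes
    for s in (1e-4, 1e-3, 5e-3):
        forward = fixation_probability(s, n_e, n)
        backward = fixation_probability(-s, n_e, n)
        print(s, math.log(forward / backward), boltzmann_log_ratio(s, n_e, n))
```

The two printed columns should agree up to floating-point error for any `s`, since the identity is exact under the diffusion formula and is not specific to the parameter values chosen here.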
