
Data-driven, evolution-based design of proteins


Overview
Abstract
Project Summary: Evolution builds proteins with a remarkable combination of characteristics. They can fold spontaneously and carry out difficult chemical reactions, but are also robust to perturbation and able to adapt as conditions of fitness fluctuate. In recent years, sequence-based statistical models have provided specific models for how all these properties are encoded in the amino acid sequence of proteins. Here, we propose a data-driven, evolution-based design (EBD) process that, with the developments outlined here, can address several basic problems in protein mechanism and evolution. We will unify and optimize approaches for EBD and then apply it (1) to quantify the functional sequence space of a protein family, (2) to parse the constraints on paralogs and orthologs of a protein family, and (3) to understand how substrate specificity in an enzyme can adapt through a process of stepwise variation and selection. The work is extensively supported by preliminary data and is enabled by new technologies for statistical inference, gene synthesis, and high-throughput functional assays, both in vitro and in vivo. The outcomes will be a unified computational framework for sequence-based statistical inference and a serious test of the power of emerging evolution-based protein design approaches to understand and engineer protein molecules.
Sponsor award ID
R01GM141697

Time
Start date
2021-08-01
End date
2025-05-31