Research

Our goal is to understand evolution and population genetics in microbial and viral populations. Broadly, we ask two types of questions. What is the structure of the enormous and high-dimensional map between the genotypes accessible to a population and relevant phenotypes? And how does evolution navigate this landscape, as mediated by factors such as selection pressures, population size, and mutation rate? We focus on characterizing statistical properties of genotype-phenotype maps and of evolutionary trajectories, driven by our view that statistical features of evolution are in principle predictable.

Mapping the genetic basis of complex traits in yeast
A large scientific community focuses on characterizing the genetic architecture of complex traits in humans, but there are fundamental constraints on doing quantitative genetics in humans. We take an orthogonal approach, developing methods that dramatically increase the power and throughput of quantitative genetics in a model organism, budding yeast. We have developed efficient and accurate genotyping using low-coverage sequencing, combinatorial barcoding to enable accurate high-throughput phenotyping, and statistical methods to detect densely spaced small-effect QTLs as well as epistatic and dominance effects. We are using these methods to infer genotype-phenotype maps for multiple traits across yeast crosses and to probe the architecture of genetic networks. We are also developing computational tools to infer the structure of these maps, for example to fine-map causal loci, identify epistatic effects, and investigate pleiotropy. Finally, we are building on a recently developed retron-based barcoded CRISPR system to conduct large-scale direct experimental validation.

Binding landscapes for immune-pathogen coevolutionary dynamics
Our adaptive immune systems are engaged in a constant coevolutionary struggle with pathogens, as pathogens adapt to evade our immune response and our immune repertoires shift in turn. These coevolutionary dynamics take place across a vast and high-dimensional landscape of potential pathogen and immune receptor sequence variants (antibodies and T-cell receptors). We are working to empirically characterize the genotype-phenotype maps that underlie these interactions. To do so, we focus on several key phenotypes (e.g. protein stability, binding affinity of antibodies to relevant antigens or of pathogens to relevant host proteins, and neutralization of pathogenic strains by sera). To analyze these maps more comprehensively, we and others have developed methods such as Tite-Seq that dramatically increase the throughput of phenotypic measurements. Together with the theoretical frameworks developed in other aspects of our research, we work to understand how evolutionary dynamics in the immune system determine the success or failure of adaptive immune responses, and to provide insight into why specific antibodies do or do not emerge.

Direct observations of the dynamics of molecular evolution
In a number of studies, we have observed the dynamics of molecular evolution in laboratory experiments, highlighting the critical role of hitchhiking and clonal interference in constraining evolution. We have also directly quantified how recombination affects the efficiency of selection, analyzed the role of epistatic interactions, and studied the spontaneous evolution of ecological interactions. More recently, we have developed a new approach to track evolutionary dynamics using “renewable” DNA barcoding methods, which allow us to follow the fates and competition of individual cell lineages at frequencies as low as one in a million. We are working to extend these observational methods to natural populations, most recently to the budding yeast and bacteria in non-aseptic bioethanol production in million-liter open fermenters in Brazil. We are also launching new directions involving tracking immune-pathogen coevolutionary dynamics and within-strain evolutionary dynamics in host-associated microbial communities.

Evolutionary dynamics in rapidly evolving populations
Existing methods in theoretical population genetics have been of limited utility in analyzing rapidly evolving microbial and viral populations, where selection often acts on multiple linked mutations simultaneously. The basic problem is that each mutation occurs at first in a single individual, making its fate crucially dependent on genetic drift, the other mutations in its genetic background, and nonlinear interactions with competing lineages. In past work, we have developed a theoretical framework for studying these effects. We and others have applied this approach to predict the rate and genetic basis of adaptation and the statistics of frequency changes of each mutation through time. However, it remains unclear how rapidly evolving microbial and viral populations are affected by selection pressures that fluctuate across space and time, by the interactions between recombination and epistasis, or by ecological interactions in more complex communities. We are working to develop new theoretical methods to explain these effects. Our longer-term goal is to use this theory as the basis for more powerful and principled ways to infer evolutionary history from sequence data.
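
To make the role of genetic drift concrete, here is a minimal sketch of the simplest version of this problem: a single beneficial mutation arising in one individual of an otherwise uniform population. It is an illustration only, not the theoretical framework described above. The haploid Wright-Fisher model, the function names, and the parameter values (population size 100, selective advantage 0.1) are all assumptions chosen for the example.

```python
import random


def fixes(pop_size, s, rng):
    """Follow one new mutation in a haploid Wright-Fisher population.

    Returns True if the mutation reaches fixation, False if it is lost.
    (Illustrative model; names and parameters are hypothetical.)
    """
    k = 1  # the mutation starts in a single individual
    while 0 < k < pop_size:
        # Selection: mutants are weighted by relative fitness 1 + s.
        p = k * (1 + s) / (k * (1 + s) + (pop_size - k))
        # Drift: the next generation is a binomial sample of size pop_size.
        k = sum(rng.random() < p for _ in range(pop_size))
    return k == pop_size


def fixation_probability(pop_size=100, s=0.1, replicates=1000, seed=0):
    """Estimate the fixation probability from independent replicates."""
    rng = random.Random(seed)
    fixed = sum(fixes(pop_size, s, rng) for _ in range(replicates))
    return fixed / replicates


if __name__ == "__main__":
    print(fixation_probability())
```

In this single-mutation setting, most beneficial mutations are lost to drift while they are still rare; the classical approximation for the fixation probability is about 2s when s is small and the population is large. The sketch deliberately omits the linked mutations and competing lineages that make rapidly evolving populations hard to analyze.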

Predicting how linked selection shapes patterns of diversity
A central goal of population genetics is to predict how natural selection shapes patterns of genomic sequence diversity. However, most existing methods are limited to looking for deviations from neutral expectations, or to explaining the action of selection at a single locus. Building on earlier ideas, we have developed a “structured coalescent” framework to account for the effects of linked selection. We have used this approach in a series of studies to analyze how various forms of linked selection affect expected patterns of sequence diversity. We have also recently introduced a new forward-time approach to these questions. We plan to continue this line of work to analyze complications critical for interpreting diversity in natural populations (e.g. interactions between recombination and linked selection) and to analyze more complex aspects of genetic diversity (e.g. those involving samples taken at different times).