Our goal is to understand evolution and population genetics in microbial and viral populations. Broadly, we ask two types of questions. What is the structure of the enormous and high-dimensional map between the genotypes accessible to a population and relevant phenotypes? And how does evolution navigate this landscape, as mediated by factors such as selection pressures, population size, and mutation rate? We focus on characterizing statistical properties of genotype-phenotype maps and of evolutionary trajectories, driven by our view that statistical features of evolution are in principle predictable.
Mapping the genetic basis of complex traits in yeast
A large scientific community focuses on characterizing the genetic architecture of complex traits in
humans, but there are fundamental constraints on doing quantitative genetics in humans.
We take an orthogonal approach, developing methods that dramatically increase the power and
throughput of quantitative genetics in a model organism, budding yeast. We have
developed efficient and accurate genotyping using low-coverage sequencing, combinatorial barcoding
to enable accurate high-throughput phenotyping, and statistical methods to detect
densely-spaced small-effect QTLs as well as epistatic and dominance effects.
We are using these methods to infer genotype-phenotype maps for multiple traits across yeast crosses
and to probe the architecture of genetic networks. We are also developing computational tools to
infer the structure of these maps, for example to fine-map causal loci, identify epistatic effects, and
investigate pleiotropy. Finally, we are building on a recently developed retron-based barcoded
CRISPR system to conduct large-scale direct experimental validation.
Binding landscapes for immune-pathogen coevolutionary dynamics
Our adaptive immune systems are engaged in a constant coevolutionary struggle with pathogens, as
pathogens adapt to evade our immune response and our immune repertoires
shift in turn. These coevolutionary dynamics take place across a vast and high-dimensional landscape
of potential pathogen and immune receptor sequence variants (antibodies and T-cell receptors).
We are working to empirically characterize the genotype-phenotype maps that underlie these
interactions, focusing on several key phenotypes (e.g. protein stability, binding affinity of
antibodies to relevant antigens or of pathogens to relevant host proteins, and neutralization of
pathogenic strains by sera).
To more comprehensively analyze the enormous and high-dimensional genotype-phenotype maps relevant
to immune-pathogen coevolution, our lab and others have developed methods such as Tite-Seq to
dramatically increase the throughput of phenotypic measurements. Together with the theoretical
frameworks developed in other aspects of our research, we work to understand how
evolutionary dynamics in the immune system determine the success or failure of adaptive immune
responses, and to provide insight into why specific antibodies do or do not emerge.
Direct observations of the dynamics of molecular evolution
In a number of studies, we have observed the dynamics of molecular evolution in laboratory
experiments, highlighting the critical role of hitchhiking and clonal interference in constraining
evolution. We have also directly quantified how recombination affects the
efficiency of selection, analyzed the role of epistatic interactions, and studied the spontaneous
evolution of ecological interactions. More recently, we have developed a new approach to track
evolutionary dynamics using “renewable” DNA barcoding methods, which allow us to follow the fates
and competition of individual cell lineages at frequencies as low as one in a million.
We are working to extend these observational methods to natural populations, most recently
those of budding yeast and bacteria in non-aseptic bioethanol
production in million-liter open fermenters in Brazil. We are also launching new
directions involving tracking immune-pathogen coevolutionary dynamics and within-strain evolutionary
dynamics in host-associated microbial communities.
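As a rough illustration of how barcode frequency trajectories like those described above can be turned into fitness estimates, the sketch below fits a straight line to the log-odds of a lineage's frequency over time. It assumes a single lineage competing against a population of constant mean fitness with no measurement noise, which is a deliberate simplification and not the inference procedure used in our studies. The function names are hypothetical.

```python
import math


def logit(frequency):
    """Log-odds of a lineage frequency strictly between 0 and 1."""
    return math.log(frequency / (1.0 - frequency))


def estimate_selection_coefficient(generations, frequencies):
    """Least-squares slope of logit(frequency) against generations.

    Under the simplifying assumption of a constant fitness advantage s
    per generation, logit(frequency) grows linearly with slope s.
    """
    if len(generations) != len(frequencies) or len(generations) < 2:
        raise ValueError("need at least two matched time points")
    log_odds = [logit(f) for f in frequencies]
    t_mean = sum(generations) / len(generations)
    y_mean = sum(log_odds) / len(log_odds)
    numerator = sum(
        (t - t_mean) * (y - y_mean) for t, y in zip(generations, log_odds)
    )
    denominator = sum((t - t_mean) ** 2 for t in generations)
    return numerator / denominator


def logistic_trajectory(initial_frequency, s, generations):
    """Noise-free synthetic trajectory, used here only to exercise the fit."""
    odds0 = initial_frequency / (1.0 - initial_frequency)
    return [
        odds0 * math.exp(s * t) / (1.0 + odds0 * math.exp(s * t))
        for t in generations
    ]


if __name__ == "__main__":
    timepoints = [0, 8, 16, 24, 32]
    # A lineage starting at one in a million with a 5% advantage (made-up values).
    trajectory = logistic_trajectory(1e-6, 0.05, timepoints)
    print(estimate_selection_coefficient(timepoints, trajectory))
```

Real barcode data add sampling noise from sequencing and a population mean fitness that changes as other lineages expand, so a practical estimator would need to model both.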
Evolutionary dynamics in rapidly evolving populations
Existing methods in theoretical population genetics have been of limited utility in analyzing
rapidly evolving microbial and viral populations, where selection often acts on multiple linked
mutations simultaneously. The basic problem is that each mutation occurs at first in a single
individual, making its fate crucially dependent on genetic drift, the other mutations in this
genetic background, and nonlinear interactions with competing lineages. In past work, we have
developed a theoretical framework for studying these effects. We and others have applied this
approach to predict the rate and genetic basis of adaptation and the statistics of frequency changes
of each mutation through time. However, it remains unclear how rapidly evolving microbial and
viral populations are affected by selection pressures that fluctuate across space and time, by the
interactions between recombination and epistasis, or by ecological interactions in more complex
communities. We are working to develop new theoretical methods to
explain these effects. Our longer-term goal is to use this theory as the basis for more powerful and
principled ways to infer evolutionary history from sequence data.
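To make concrete the point that a new mutation's fate depends crucially on genetic drift, the following sketch simulates a single beneficial mutation as a branching process with Poisson-distributed offspring. This is a textbook simplification, not the theoretical framework described above: it ignores linked mutations and competing lineages entirely, and the parameter values and function names are illustrative assumptions. For small advantages s, the classical expectation is that only a fraction of roughly 2s of such mutations escape loss by drift.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson variate using Knuth's multiplication method.

    Adequate for the modest means used here (mean stays below ~110).
    """
    limit = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def establishes(s, rng, threshold=100):
    """Follow one new mutant lineage until it is lost or reaches `threshold`.

    Each generation the lineage of size n is replaced by Poisson(n * (1 + s))
    descendants. Reaching `threshold` copies is treated as establishment.
    """
    n = 1
    while 0 < n < threshold:
        n = poisson(n * (1.0 + s), rng)
    return n >= threshold


def establishment_probability(s, trials=5000, seed=0):
    """Fraction of independent single-copy mutations that establish."""
    rng = random.Random(seed)
    return sum(establishes(s, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    s = 0.1  # illustrative fitness advantage
    print("simulated:", establishment_probability(s))
    print("2s approximation:", 2 * s)
```

Most lineages in this simulation are lost within a few generations even though the mutation is beneficial, which is why the early, stochastic phase matters so much for the eventual outcome.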
Predicting how linked selection shapes patterns of diversity
A central goal of population genetics is to predict how natural selection shapes patterns of genomic
sequence diversity. However, most existing methods are limited to looking for deviations from
neutral expectations, or to explaining the action of selection at a single locus. Building on
earlier ideas, we have developed a “structured
coalescent” framework to account for selection acting at many linked sites. We have used this approach in a series of
studies to analyze how various forms of linked selection affect expected patterns of sequence
diversity. We have also recently introduced an entirely novel forward-time approach to these
questions. We plan to continue this line of work to analyze complications critical for interpreting
diversity in natural populations (e.g. interactions between recombination and linked selection) and
to analyze more complex aspects of genetic diversity (e.g. those involving samples taken at
different times).
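For readers unfamiliar with what "neutral expectations" look like in practice, the sketch below computes two standard summaries of sequence diversity from a small sample of aligned haplotypes: the average number of pairwise differences and Watterson's estimator based on the number of segregating sites. Under the standard neutral model both have the same expectation, so a large gap between them is one classical signal of departure from neutrality. The input format and function names are assumptions made for illustration, and this is not the structured coalescent approach described above.

```python
from itertools import combinations


def pairwise_diversity(haplotypes):
    """Mean number of differing sites over all pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def segregating_sites(haplotypes):
    """Number of alignment columns with more than one allele."""
    return sum(len(set(column)) > 1 for column in zip(*haplotypes))


def watterson_theta(haplotypes):
    """Segregating sites divided by the harmonic number a_n = sum_{i<n} 1/i."""
    n = len(haplotypes)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(haplotypes) / a_n


if __name__ == "__main__":
    # Toy sample of four aligned haplotypes (made-up data, equal lengths).
    sample = ["0000", "0001", "0011", "0111"]
    print("pairwise diversity:", pairwise_diversity(sample))
    print("Watterson's theta:", watterson_theta(sample))
```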