Department of Systems & Computational Biology

Past Seminars

Seminars: 2013/2014

David Sivak, Ph.D.
University of California, San Francisco
Free energy, optimal control, and optimal response in microscopic non-equilibrium systems

Abstract: Molecular machines are protein complexes that convert between different forms of energy, and they feature prominently in essentially any major cell biological process. A plausible hypothesis holds that evolution has sculpted these machines to efficiently transmit energy and information in their natural contexts, where energetic fluctuations are large and non-equilibrium driving forces are strong. Toward a systematic picture of efficient, stochastic, non-equilibrium energy and information transmission, I present theoretical developments in three distinct yet related areas of non-equilibrium statistical mechanics: How can we measure how far from equilibrium a driven system is? How do we find efficient methods to push a system rapidly from one state to another? And finally, what are generic properties of systems that efficiently harness the energy and information present in environmental fluctuations? For further details: http://davidsivak.com/

 
 
 

Fred Davis, Ph.D.
HHMI Janelia Farm
Cell type-specific genomics of Drosophila neurons

Abstract: The diversity of gene expression across cell types is particularly striking in the myriad cell types of the nervous system. While most genomic methods are typically applied to whole tissues, new technologies have started to make these methods applicable to individual cell types, including neurons in the brain. I will begin by describing the first cell type-specific gene expression and histone modification profiles measured from distinct neuronal subpopulations in the Drosophila brain. In addition to recovering known gene expression differences, these profiles indicate significant cell type–specific chromatin modifications. In particular, a small subset of differentially expressed genes exhibits a striking anti-correlation between repressive (H3K27me3) and activating (H3K27ac) histone modifications. These genes are enriched for transcription factors, recovering known and predicting new regulators of neuronal identity. I illustrate the utility of this chromatin pattern by demonstrating that it can be used as a genome-wide screen in mammalian systems to significantly enrich for transcription factors that can convert adult cell identity. I will close by describing the utility of cell type-specific genomic profiling in the context of neural circuits, presenting expression profiles measured from the first neuropil of the Drosophila visual circuit. Our results suggest that cell type-specific profiling of neuronal populations can illuminate how these neurons develop and function in the adult brain.

 
 
 

Lucas Ward, Ph.D.
Computer Science and Artificial Intelligence Laboratory, MIT
Using regulatory genomics to decipher disease genetics and evolutionary dynamics

Abstract: Association and linkage studies provide genome-wide information about the genetic basis of complex disease, but medical research has focused primarily on protein-coding variants, owing to the difficulty of interpreting noncoding mutations. This picture has changed with advances in the systematic annotation of functional noncoding elements. Evolutionary conservation, functional genomics, chromatin state, sequence motifs and molecular quantitative trait loci all provide complementary information about the regulatory function of noncoding sequences. I will first discuss problems in deciphering the transcriptional regulatory code, and work I have done in model organisms to study the interplay between chromatin, regulatory motifs, and evolutionary turnover. I will then discuss regulatory genomics modeling in humans using large compendia of epigenomic maps, which has allowed us to generate hypotheses about which variants on disease haplotypes are causal, to perform systems-level analyses that reveal regulatory pathways underlying complex phenotypes, and to detect lineage-specific purifying selection through aggregated patterns of human diversity. Finally, I will discuss how these models pave the way to interpret mutations found through clinical whole-genome sequencing and to perform rare-variant association studies, and how they will let us better understand our evolutionary history.

 
 
 

Istvan Borzak, Ph.D.
University of West Hungary, Faculty of Natural Sciences
Transition path sampling and many body properties in viscosity and water nucleation simulations

Abstract: The Transition Path Sampling (TPS) method is presented, along with its wide range of possible applications, from proton transfer and ice nucleation in water to biochemical processes like protein folding. The method helps in studying rare events and grasping the underlying many-body mechanisms of these processes. The case of ice nucleation is discussed. Viscosity is also a property that depends on the coordinates and momenta of all particles in the fluid. A method based on the Transient Time Correlation Function (TTCF) formalism is presented, which enables us to calculate the shear viscosity of dense fluids not only at equilibrium but also over a very wide range of strain rates. Results are shown for simple model fluids.

 
 
 

Ping Xie, Ph.D.
State University of New Jersey, Piscataway, New Jersey
Mechanisms of TRAF3 inactivation-initiated B lymphomagenesis

Abstract: TRAF3, a member of the TRAF family of cytoplasmic adaptor proteins, is employed in signaling by a variety of immune receptors, including the tumor necrosis factor receptor superfamily, Toll-like receptors, NOD-like receptors, and RIG-I-like receptors. To explore the in vivo functions of TRAF3 in B lymphocytes, we recently generated a genetically modified mouse model that has the TRAF3 gene specifically deleted in B cells. We found that TRAF3 deletion results in prolonged survival of mature B cells, which eventually leads to spontaneous development of B lymphomas in mice. Corroborating our findings, TRAF3 deletions and inactivating mutations were identified in human B cell neoplasms, including multiple myeloma, splenic marginal zone lymphoma, B cell chronic lymphocytic leukemia, and mantle cell lymphoma. We are currently investigating the molecular mechanisms of TRAF3 inactivation-initiated B lymphomagenesis using complementary human and mouse model systems. To this end, we employ a number of cutting-edge strategies, including microarray analyses, proteomics, bioinformatics, and deep sequencing. Here I present our new data from this project, which provide useful information for the rational design of novel therapeutics and treatment strategies to combat human B cell malignancies.

 
 
 

Yitzhak Pilpel, Ph.D.
Associate Professor, Department of Molecular Genetics, Weizmann Institute of Science
A dual program for translation regulation in cellular proliferation and differentiation

Abstract: A dichotomous choice for metazoan cells is between states of proliferation and differentiation. Measuring tRNA pools in various cell types, we found two distinct subsets: one that is induced in proliferating cells and repressed otherwise, and another with the opposite signature. Correspondingly, we found that genes serving cell-autonomous functions and genes involved in multi-cellularity obey distinct codon usage. Proliferation-induced and differentiation-induced tRNAs often carry anti-codons that correspond to the codons enriched among the cell-autonomous and the multi-cellularity genes, respectively. Since mRNAs of cell-autonomous genes are induced in proliferation, and in cancer in particular, the concomitant induction of their codon-enriched tRNAs suggests coordination between transcription and translation. Histone modifications indeed change similarly in the vicinity of cell-autonomous genes and their corresponding tRNAs, and in multi-cellularity genes and their tRNAs, suggesting the existence of transcriptional programs coordinating tRNA supply and demand. Hence, we describe the existence of two distinct translation programs that operate during proliferation and differentiation.

 
 
 

Evgeny Shmelkov, Ph.D.
New York University School of Medicine
Drug Discovery: Data to Knowledge

Abstract: Rapid development of high-throughput screening technologies has resulted in the accumulation of enormous quantities of biomedically relevant data. Understanding these data can eventually lead to the identification of more complex molecular biomarkers and the development of more efficient therapeutic interventions for human diseases. Unfortunately, making scientific sense of most of these data has generally been difficult. Indeed, it is still true that the majority of therapeutic interventions in current clinical use were discovered by phenotypic screens, agnostic of their precise mechanism of action. Here, I will describe novel computational and conceptual approaches for bringing diverse biomedical data together in order to accelerate drug and vaccine discovery. In the first part, I will discuss rational vaccine design approaches addressing the challenge of antigenically variable infectious pathogens. In the second part, I will demonstrate how publicly available biological data can be integrated to gain novel insights into the organic basis of human diseases.

 
 
 

Prof. Elisha Moses
Department of Physics of Complex Systems, Weizmann Institute of Science
Dynamics in Small Neuronal Networks

Abstract: Cultured networks of neurons from rat hippocampus constitute a fascinating and important model for biological computation. While the individual neurons retain their physiological characteristics as in the intact brain, the structure and connectivity they generate in the network is much simpler to measure and analyze, and therefore to engineer and design. We have studied a variety of patterns and configurations of neurons, unraveling much of the intricate network structure underlying the dynamics of these cultures. We present measurements of information, propagation and structure in both 1D and 2D geometries, which underlie the computational capability of the culture. Two-dimensional excitation dynamics are dominated by a percolation process in which a quorum is recruited for exciting a single neuron. One-dimensional patterns allow the construction of logical devices, as well as the unraveling of fundamental questions related to the excitation of single neurons by an external field. We end by using cultured networks to show that pathology in neurons from Down syndrome mouse models is dominated by perturbed potassium channel regulation.

 
 
 

Chen Hou, Ph.D.
Missouri University of Science and Technology, Department of Biological Sciences
Energy Trade-Off Between Growth And Longevity

Abstract: In the last few decades, two types of intra-specific studies have highlighted the trade-off between growth and longevity. First, diet restriction (DR), as an environmental intervention, has been shown to suppress growth and extend the lifespan of a broad range of animals. Second, genetic studies have shown that mice whose growth hormone function is genetically modified (GM) grow more slowly and live longer than their wild-type siblings. In this talk, I will present a simple mechanistic model that quantifies explicitly how DR and GM alter the animal’s energy budget and channel metabolic energy to somatic maintenance by suppressing growth, thereby extending lifespan. Data from a diverse set of empirical studies on small rodents support the quantitative predictions of the model. More importantly, the model reveals, and the empirical data confirm, that although DR and GM are two different methods of extending lifespan, one environmental and one genetic, their underlying mechanisms are the same from the energetic viewpoint. DR enhances animals’ health maintenance, whereas refeeding reverses the beneficial effects of DR. However, the degree to which refeeding reverses these effects remains controversial. The model reconciles the results of the refeeding studies and reveals the dynamic and reversible mechanism underlying the effects of diet on health. I will show that in some species the energetic cost of synthesizing biomass increases during growth, so the expensive compensatory growth induced by refeeding later in life offsets the benefits of cheap retarded growth induced by diet restriction early in life. Thus, in these species, refeeding drives animals to allocate more energy to growth and less to maintenance, and therefore leads to poorer health status and a shorter lifespan compared to the free-fed controls.

 
 

Seminars: 2012/2013

Dimitri “Mitya” Chklovskii, Ph.D.
Group Leader, Janelia Farm Research Campus, Howard Hughes Medical Institute
Understanding the building blocks of neural computation: Insights from connectomics and theory

Abstract: Animal behaviour arises from computations in neuronal circuits, but our understanding of these computations has been frustrated by the lack of detailed synaptic connection maps, or connectomes. For example, despite intensive investigations over half a century, the neuronal implementation of local motion detection in the insect visual system remains elusive. We developed a semi-automated pipeline using electron microscopy to reconstruct a connectome, containing 379 neurons and 8,637 chemical synaptic contacts, within the Drosophila optic medulla. By matching reconstructed neurons to examples from light microscopy, we assigned neurons to cell types and assembled a connectome of the repeating module of the medulla. Within this module, we identified cell types constituting a motion detection circuit, and showed that the connections onto individual motion-sensitive neurons in this circuit were consistent with their direction selectivity. Our identification of cell types involved in motion detection allowed targeting of extremely demanding electrophysiological recordings by other labs. Preliminary results from such recordings are consistent with a correlation-based motion detector. This demonstrates that connectomes can provide key insights into neuronal computations.

 
 
 

Andreas Doncic, Ph.D.
Stanford University, Department of Biology
Uncovering Single-Cell Dynamics Of Host-Pathogen Interactions

Abstract: To survive, all living cells must constantly interpret and react to large numbers of noisy signals in a changing environment. Importantly, upon exposure to the immune system, pathogens activate stress-specific responses that allow them to adapt and survive. Notably, these responses are stochastic: even in a clonal population, some cells live and adapt while some die. This suggests that cell-to-cell differences in how the signal is interpreted and transduced play a key role in pathogenicity. During my time as a postdoc, I have focused on understanding the dynamical properties that control cell fate decisions, using budding yeast as a model organism. To this end, I have developed a broadly applicable assay that allows extended tracking of key fluorescent proteins in individual cells with unprecedented accuracy. By exposing yeast cells to various stresses and following their dynamical responses, we have been able to uncover several previously unknown dynamical properties regulating cell fate decisions in this organism. Here I propose to apply this assay to understanding the key determinants of fungal virulence, using yeast and neutrophils as a model interaction system. The underlying mechanisms necessary for yeast survival following neutrophil phagocytosis are currently poorly understood. Considering the highly stochastic outcome of this process, it is likely that the dynamical response of cells is extremely important.

 
 
 

Markus Dahlem, Ph.D.
Humboldt-Universität zu Berlin
Migraine: A dynamical disease from molecular to cellular to whole organ level

Abstract: Computational, mathematical and experimental migraine models are introduced, focusing on a phenomenon called spreading depression (SD). SD is a transient pattern-forming state that, during its course, massively perturbs the brain's ion homeostasis by seizure-like discharges. Under conditions of acute stroke, SD can even be the switch between life and death for nerve cells. In migraine, SD is the key to a subtype, migraine with aura (MA). A mechanism is presented by which localized SD wave fragments are formed in the cortex. We investigated statistical properties in a computational model, which provides a dynamical understanding of ictogenesis in MA. According to this model, SD forms particular shapes mainly characterized by size and duration. Similar patterns have been observed with fMRI in migraine and are reported by patients as visual field defects. The results support the controversial idea that SD can have a causal relationship with the headache phase in migraine. In particular, three predictions are discussed: (i) that the cascade initiating the pain phase depends on a sufficiently large area instantaneously affected by SD, (ii) that SD in migraine without aura (MO) neither lasts long nor propagates far enough to cause noticeable aura symptoms because the initial perturbation covers a large area, and (iii) that only from a similar ictogenic focus can SD break away and propagate as a localized wave. This would also explain why, on average, the headache is reported to be less severe in MA than in MO. Furthermore, neuromodulation techniques, which may affect the suggested pathways of aura and pain formation, are briefly discussed.

 
 
 

Sasha Levy, Ph.D.
Stanford University
High-throughput lineage tracking reveals complex population dynamics in response to environmental change

Abstract: Epigenetic and genetic variation within closely related cell populations could act as a risk-spreading strategy that increases the population's fitness in changing environments. Yet little is known about the extent of either epigenetic or genetic variation in evolving populations, or whether this variation results in meaningful fitness differences when the environment changes. We developed two novel high-throughput fitness assays to study variance within isogenic and evolving populations of Saccharomyces cerevisiae. First, using a high-throughput microscopy assay that monitors growth and survival of tens of thousands of microcolonies simultaneously, we find that clonal populations display broad distributions of growth rates and that slow growth predicts resistance to heat killing in a probabilistic manner. Growth rate and survival heterogeneity appears to be due to a combination of stochastic and deterministic factors, with one deterministic factor being the replicative age of the cell. Second, to follow lineage trajectories over longer time scales, we developed a genetic method to label otherwise isogenic cells with half a million unique molecular barcodes that serve as lineage tags during experimental evolution studies. By tracking the relative frequencies of lineage tags over time, we observe the granular dynamics of an evolving population and measure, at high resolution, its population parameters such as the beneficial mutation rate and the distribution of fitness effect sizes. We find a high rate of beneficial mutations, suggesting that, at most naturally occurring population sizes, competition between lineages drives high genetic variability. I will discuss future directions for these technologies, including the use of double barcodes for high-throughput studies of evolution across changing environments, statistical epistasis, and genetic and protein-protein interactions.

 
 
 

Jianhua Xing, Ph.D.
Department of Biological Sciences, Virginia Tech
Physics meets biology: unraveling mysteries of biological complexity

Abstract: The physical world seems to be exceedingly complex, yet it is governed by simple, elegant laws of physics and chemistry that can be quantified mathematically. Inspired by the success of quantitative theories in describing the physical world, my research focuses on the application of theoretical and computational modeling with the goal of discovering organizing principles underlying complex living systems and their implications for the medical sciences. In this talk I will focus on two projects done in my lab. Inflammatory responses of the innate immune system need to be tightly regulated. Excessive inflammation is related to numerous human diseases. Sepsis is one such pathological condition, causing more deaths than prostate cancer, breast cancer, and AIDS combined in the US. Priming refers to the phenomenon in which repeated exposure to stimulants results in non-additive, augmented cellular responses, especially in immune cells. For example, studies show that a sub-threshold low dose of bacterial lipopolysaccharide (LPS, endotoxin), a condition called low-grade endotoxemia, may prime (or sensitize) the innate immune system for nonlinearly augmented inflammatory responses under an above-threshold LPS stimulation. By means of a computational search through the parameter space of a coarse-grained three-node network with a two-stage Metropolis sampling approach, we enumerate all the network topologies that can generate priming. The numerical analysis automatically reveals three major mechanisms (pathway synergy, suppressor deactivation, activator induction). We then apply the strategy suggested by the theoretical study to analyze human macrophage microarray data, and reveal candidates for interferon-γ-induced priming effects.
Recent breakthroughs in cell phenotype reprogramming pose the theoretical challenges of unravelling the complexity of large circuits maintaining cell phenotypes, coupled at many different epigenetic and gene regulation levels, and of quantitatively describing the phenotypic transition dynamics. A popular picture proposed by Waddington views cell differentiation as a ball sliding down a landscape with valleys corresponding to different cell types separated by ridges. Based on theories of dynamical systems, we establish a novel "epigenetic state network" framework that captures the global architecture of cell phenotypes, which allows us to translate Waddington's metaphorical low-dimensional epigenetic landscape concept into a simple yet predictive, rigorous mathematical framework of cell phenotypic transitions. Specifically, we simplify a high-dimensional epigenetic landscape into a collection of discrete states corresponding to stable cell phenotypes connected by optimal transition pathways among them. We then apply the approach to the reprogramming of fibroblasts to induced pluripotent stem cells (iPSCs) and cardiomyocytes, and correctly predict intermediate states and multiple reprogramming pathways, which are supported by existing microarray data and other experiments. New experiments are further suggested.

 
 
 

Libusha Kelly, Ph.D.
Massachusetts Institute of Technology
Predicting the impact of genetic variation: from genome to ecosystem

Abstract: How does genetic variability influence cellular response to environmental pressures? I will explore this question in two seemingly disparate systems, the global ocean and the human genome, and end by proposing that an ecosystem-based perspective is vital for exploring the role of genetic variability in human disease and drug response. In the oceans, we use genomic and metagenomic approaches to identify habitat-specific patterns in phage gene abundance that signify environmental pressures. Furthermore, we identify regulatory motifs tied to host nutrient sensing and uptake systems and find conserved phage genes associated with multiple host metabolic processes. This work highlights the role of viruses as metabolic engineers of host cells and the influence of microbe/phage interactions in ecosystem functioning. The human body is also an ecosystem; however, human genomes are typically studied in isolation from the microbial inhabitants that colonize us. In an effort to identify genetic signatures of altered drug response, we examined the effects of single nucleotide polymorphisms in a large membrane protein family implicated in drug transport and disease. Notably, we located conserved structural motifs associated with altered function and disease states in multiple transporters. However, future investigations of human genetic variability must look beyond the human genome alone; a perspective encompassing both human genetic variation and variation within the human microbiome is necessary to accurately predict both an individual’s predisposition to disease and his or her response to specific therapeutic agents.

 
 
 

Duygu Ucar, Ph.D.
Stanford University
Characterizing the regulatory role of combinatorial and spatial deposition of epigenetic marks

Abstract: Epigenetic mechanisms, such as post-translational modifications of histone proteins, play an important role in regulating gene expression. Individual histone modifications, such as acetylation, methylation, and phosphorylation, have been shown to regulate gene expression by changing chromatin structure and creating binding sites for effector proteins. In addition to the co-occurrence of an epigenetic mark with others, it has been shown that the spatial pattern of its deposition sites might have an impact on its functionality. We developed and utilized computational models to reveal the importance of spatial and combinatorial patterns of epigenetic mark deposition sites. Joint analyses of large-scale histone modification maps are starting to reveal combinatorial patterns of histone modifications that are associated with functional DNA elements, providing support for the unified ‘histone code’ hypothesis. However, due to the lack of computational methods, only a small number of histone modification patterns have been associated with well-known functional DNA elements, e.g. promoters and enhancers. To identify the complete set of combinatorial and coherent histone modification patterns across the entire human genome, we propose a scalable bi-clustering algorithm, which identified many combinatorial histone modifications that are frequently repeated on human DNA. Some of these patterns involve known modification combinations associated with functional DNA elements, while others are novel patterns whose potential functional roles warrant further experimental characterization. In addition to combinatorial patterns of epigenetic marks, recent studies show that different spatial patterns at epigenetic mark deposition sites might also have differing functional meaning. For example, longer H3K4me3 deposition sites are known to mark key regulators of a given cell.
By the computational analysis of existing high-throughput datasets, we characterized the functional role of long H3K4me3 deposition sites in four different organisms including human.

 
 
 

Manu, Ph.D.
University of Chicago
Precision and robustness of segmentation gene expression during early Drosophila development

Abstract: Robustness to genetic and environmental perturbation is one of the hallmarks of animal development. During the past decade or so robustness has been documented at the molecular level, particularly for gene regulatory networks, although we have only now begun to identify the mechanisms underlying this property of development. I will present a set of related theoretical and experimental studies that establish that segmentation gene expression during the early development of Drosophila melanogaster exhibits robustness and identify multiple underlying gene regulatory mechanisms and features acting at many stages of segmentation. The segmentation genes are expressed in progressively finer patterns during the first three hours of development, ultimately leading to the establishment of the molecular prepattern of the segmented body plan of the Drosophila larva. One striking feature of segmentation is that the expression patterns of zygotically-expressed genes are placed much more precisely than those of the maternally-expressed ones. Using an experimentally-validated computational model, I will show that certain negative feedback loops in the gene network ensure that gene expression is precise. Furthermore, using methods from non-linear dynamics I will demonstrate that such feedback loops act to establish stable steady states that confer stability to segmentation gene expression. In a set of experimental investigations using whole-locus transgenes and live imaging, I will show that the cis-regulatory architecture of genes, that is, the number and placement of transcription-factor binding sites, is optimized for precision and buffering temperature perturbation. Finally, I will present evidence that segmentation, hitherto thought to proceed identically in males and females, is in fact sex-dependent during the earliest stages. 
Sex-dependent pattern formation will be traced to the incomplete dosage compensation of an X-linked gene and I will show that the system buffers this perturbation during later stages to produce sex-independent segmentation.

 
 
 

Alireza Soltani, Ph.D.
Stanford University
Compete and learn: synaptic reverberation and reward-dependent plasticity as common neural mechanisms underlying attention and adaptive decision

Abstract: Attention describes the mechanism by which the brain allocates its limited computational resources to behaviorally relevant objects or locations. Attention can also be viewed as a form of decision-making in that both use similar neural substrates and both are influenced by reward. I demonstrate how computational modeling can be used to explore similar and distinct neural mechanisms underlying cognitive functions. Specifically, I show that synaptic reverberation and reward-dependent plasticity are common neural mechanisms underlying both adaptive decision-making and attention. Computational models at different levels of complexity can provide powerful tools to investigate cognition, and link behavior to biophysical properties of the nervous system.

 
 
 

Karunesh Arora, Ph.D.
Department of Chemistry and Biophysics Program, University of Michigan
Enzyme Conformational Dynamics During Catalysis

Abstract: Deciphering how enzymes achieve enormous rate enhancements over similar uncatalyzed reactions in solution is the holy grail of biochemistry. Currently, the most popular proposal for explaining the enzyme’s catalytic rate enhancement connects the enzyme's dynamics to its catalytic ability. It has been suggested that protein conformational fluctuations, such as loop movements and domain motions preceding the chemical step in the enzyme’s catalytic cycle, are crucial for enzyme catalysis. However, exactly how these protein motions affect the chemical transformations is controversial. Resolving this question is made more challenging by the fact that atomistic experimental methods provide crucial information about ground-state structures along the turnover cycle of enzymes but reveal less about the transition dynamics between these ground-state structures. Since the rate of catalysis is determined by the ascent of the enzyme/substrate complex over the energy barriers between the ground-state structures, it is crucial to determine their conformational transition pathways. I will discuss our efforts to understand the fundamental mechanism of enzyme catalysis through the development of advanced conformational sampling methodologies that can capture the long-time-scale dynamics of enzymes. I will show how these computational tools have enabled a greater quantitative understanding of enzyme catalysis by providing access to functional enzyme dynamics during catalysis, as well as the corresponding free-energy landscape that defines the relative probabilities of thermally accessible conformational states of enzyme/substrate complexes and the free-energy barriers between them. In addition, I will show how this detailed atomic-level description of functional enzyme dynamics is opening new avenues for the development of highly efficient inhibitors and the design of novel protein catalysts.
I will conclude by describing a mechanistic view of enzyme catalysis emerging from our integrated computational studies of enzyme dynamics, which suggests that the conformational dynamics of enzymes is finely tuned to achieve reactive conformations for efficient catalysis.

 
 
 

Chaolin Zhang, Ph.D.
Rockefeller University
Global RNA regulatory networks in the mammalian brain: insights from an integrative systems biology approach

Abstract: Regulation of gene expression at the RNA level, including alternative splicing, polyadenylation, editing, transport and stability, plays critical roles in normal mammalian physiology, and is highly relevant to various human diseases, ranging from neurologic diseases to cancer. At the center of such regulation are interactions of several hundred RNA-binding proteins (RBPs) with their target transcripts, or RNA regulatory networks. Dissecting such networks, especially at the global scale, has been challenging, but recent advances in high-throughput technologies and their various applications are now opening new avenues to this problem. In this talk, Dr. Zhang will discuss the generation, integration and modeling of multiple types of heterogeneous data, including transcriptome profiles, maps of in vivo protein-RNA interactions, and various genomics and evolutionary features, to accurately infer comprehensive RNA-regulatory networks in the mammalian brain. Computational and experimental analysis of these networks demonstrates the potential of this strategy to formulate experimentally testable hypotheses, and yield insights into the complexity of RNA regulation, its functional organization and potential connections to neuronal disorders.

 
 
 

Simon Gravel, Ph.D.
Stanford University
Inferring complex histories from genome-wide data

Abstract: Complex patterns of human demographic history dramatically shaped current genomic diversity. Detailed genetic models can therefore help understand how our species evolved, and are useful to interpret and optimize the design of association studies. I present tractable models of human dispersal and migration, from early modern humans in Africa to the recent era of intercontinental travel, and their impact on allele frequencies and linkage disequilibrium patterns. Applying the models to whole-genome data from the 1000 Genomes Project and Complete Genomics, as well as to a panel of recently admixed populations, I provide estimates for the timing and magnitude of different historic and prehistoric events, and discuss the impact of the observed fine-scale population structure on disease associations with rare variants.

 
 
 

Nathalie Pochet, Ph.D.
The Broad Institute of MIT and Harvard
Integrative genomics and genetics: from yeast evolution to human disease

Abstract: My aim is to develop innovative computational approaches for the integration and analysis of genomic data in order to gain mechanistic and functional insights into human disease. My work illustrates how integration of genomic data and studies in model microbial systems led to the development of new approaches that both address fundamental biological questions and catalyze studies in more complex organisms. In this talk, I will cover three case studies of human diseases where I developed strategies to discover disease mechanisms and therapeutic targets: (1) a comparative approach to deciphering regulatory programs in the malaria parasite; (2) the role of variation in repeat sequences: from yeast evolution to human disease; and (3) deciphering the molecular mechanisms underlying chronic lymphocytic leukemia.

 
 

Seminars: 2011/2012

Edward O’Brien, Ph.D.
University of Cambridge
Theoretical insights into co-translational folding: From pathways to proteomes

Abstract: Understanding protein folding and protein homeostasis in living cells requires that we understand the concomitant folding of proteins during their biosynthesis by the ribosome molecular machine. I will discuss our efforts to understand the principles of such co-translational folding through the development of coarse-grained simulation force-fields, chemical kinetic modeling, and systems biology methods. These tools have allowed us to gain novel insights into fundamental issues in in vivo folding including the impact of variable translation rates, the effect of chaperones, and the co-translational folding properties of the E. coli proteome. These methods provide a framework for designing transcriptomes with desired co-translational folding properties.

 
 
 

Robert Johnston Jr., Ph.D.
New York University
Controlling stochastic gene expression in the Drosophila retina

Abstract: Development requires specific gene regulatory programs to generate reproducible body plans that have been selected throughout evolution. To this end, genes are typically expressed in uniform or regionalized patterns in cells of specific tissues. However, stochastic patterning is sometimes utilized to diversify cell fates, particularly in nervous systems. Similar to the human color vision system, the fly eye expresses several light-detecting Rhodopsin proteins that are stochastically distributed. A complex interlocked feedforward loop network motif determines cell-type specific expression of Rhodopsins. This motif is finely tuned to ensure repression or activation outcomes. The stochastically-expressed Spineless transcription factor is the critical trigger for this network motif. Controlled by an unusual cis-regulatory logic, Spineless is activated in all cells by an enhancer and repressed in a random subset of cells by the combination of several repressor elements.

 
 
 

Benjamin Greenbaum, Ph.D.
Institute for Advanced Study
Simons Center for Systems Biology, School of Natural Sciences Princeton University
Host Pathogen Co-evolution: RNA Viruses & Innate Immunity

Abstract: When a rapidly evolving RNA virus changes its host, its evolution can reveal new information about host biology. Highly pathogenic influenza in humans has been associated with an avian virus entering the human population and initiating an aberrant innate immune response. I will discuss a novel statistical approach which reveals that nucleotide motifs found in avian strains are avoided by strains evolving in humans. Moreover, host innate immune genes avoid many of the same signals, implying a new way to classify antiviral innate immune genes. This work led to the prediction, recently validated, that these "non-self" signals, highly present in avian viruses, stimulate the human innate immune response and may contribute to its overstimulation by avian strains. Finally, I will discuss some other statistical approaches to host-virus problems.

 
 
 

Eric Smith, Ph.D.
Santa Fe Institute
Biochemical modularity and the logic underlying the emergence and long-range evolution of metabolism

Abstract: Structured systems of constraints are an inherent and essential part of evolutionary dynamics. The emergence and early evolution of life were constrained by the geochemical context of the Hadean earth, and apparently by limitations in the local functional groups and reaction mechanisms that could be accessed within organic chemistry in interactions with transition metals. Reconstruction of the early evolutionary history of carbon fixation likewise suggests strong constraints: very little innovation has ever occurred in autotrophic fixation pathways; all known innovations appear to have been accomplished very early within the diversification of bacteria and archaea; and the evolution of carbon fixation has been structured by a small number of responses of a starkly modular network architecture to a few pressures from energy optimization, oxygen toxicity, redox potential, and alkalinity. Moreover, the emerging control over these innovations through cofactors may be speculatively (but sensibly) placed within the elaboration of metabolism itself. A simple logic governing the very long-range evolutionary dynamics of metabolism is suggested: constraints and module boundaries intelligible in geochemical terms continued to determine metabolic phenotypes through the entire history of life, despite the fact that most of the hierarchical control apparatus nominally responsible for evolution was accreted after the major innovations had ceased.

 
 
 


Simon DeDeo, Ph.D.
Santa Fe Institute
Natural Computation in Social and Biological Systems

Abstract: Understanding how evolved, as opposed to engineered, systems perform computations and process information is a common theme in the study of biological systems. But how can we gather and analyze data from biological systems in ways that connect to the mathematical objects of a particular theory? We present three analytic methods, drawn from mathematics and computer science (noisy logic; the group-theoretic decomposition of finite state automata), economics (inductive game theory), and physics (spectral analysis of fluctuations). We then show how these play out in an ongoing collaborative, empirical study of the highly-resolved social dynamics of a primate social system.

 
 
 

Sushmita Roy, Ph.D.
Broad Institute of Harvard and MIT
Comparative Inference and Analysis of Regulatory Networks: From Yeast to Fly

Abstract: Transcriptional networks specify the set of connections between transcription factors and target genes, and drive cellular decisions under different spatial and temporal contexts. I will present computational methods for understanding regulatory networks in two very different temporal contexts: a developmental context and an evolutionary context. In the first part of my talk, I will present an approach to infer the regulatory network for the fly, Drosophila melanogaster, by integrating diverse datasets including ChIP binding, motif instances, expression and chromatin. We used the regulatory network structure to predict expression and process annotation of target genes, demonstrating the functional value of the connections of the network.
In the second part of my talk, I will present a novel multi-species clustering algorithm, Arboretum, that can be used to infer expression clusters of multiple organisms. Arboretum incorporates phylogenetic information encoded in both gene and species trees to identify clusters of co-expressed genes across multiple species, as well as the cluster assignment in the ancestral species. We applied our approach to transcriptional response data of 15 ascomycete yeast species as they run out of glucose. We identified five main expression clusters that are conserved across the 15 species, and are associated with biological processes and cis-regulatory elements that are consistent with the global glucose starvation experienced by these cells. We analyzed sets of genes with similar phylogenetic histories and identified several groups that are associated with life-style specific adaptations, including the convergent evolution of mitochondrial genes. Finally, we used Arboretum to study gene duplications and found that although duplicates do not change the overall expression patterns, they enable more divergence in clusters across species. Overall, these results are consistent with known information on transcriptional response under glucose starvation, and provide new insights into the evolutionary patterns of duplicates and previously unstudied groups of genes.

 
 
 

Jessica Mar, Ph.D.
Dana-Farber Cancer Institute
Modeling Cell Fate Transitions and a Variance-Based Approach to Studying Human Disease

Abstract: In studying biological systems, there are two areas that represent future clinical potential – transitions and variance. Understanding how cells transition from one fate to another is a fundamental problem in biology, with implications for better understanding evolution, development and the etiology of human disease. I will describe how gene expression trajectory models that chart cellular transitions in gene expression state space have proven useful for identifying the biological processes driving these types of transitions.
Throughout the history of biological science, almost all experiments have focused exclusively on comparing average signals of different phenotypic groups. But increasingly there is evidence that variation may play an equally important role in determining cellular phenotypes. Using olfactory stem cells derived from patients with schizophrenia or Parkinson’s disease and from a healthy control group, we find marked differences in expression variance in cell signaling pathways that shed new light on potential mechanisms associated with these diverse neurological disorders.

 
 
 

Jeff Gore, Ph.D.
Department of Physics, Massachusetts Institute of Technology
Cooperation and cheating in microbes

Abstract: Understanding the cooperative and competitive dynamics within and between species is a central challenge in evolutionary biology. Microbial model systems represent a unique opportunity to experimentally test fundamental theories regarding the evolution of cooperative behaviors. In this talk, I will describe recent experiments probing the cooperative growth of yeast in sucrose and the cooperative inactivation of antibiotics by bacteria. In both cases we find that cheater strains—which don’t contribute to the public welfare—are able to take advantage of the cooperator strains. However, this ability of cheaters to out-compete cooperators occurs only when cheaters are present at low frequency, thus leading to steady-state coexistence. These microbial experiments provide fresh insight into the evolutionary origin of cooperation. In addition, the challenges of maintaining cooperation in a population may have implications for clinically important microbial behaviors such as antibiotic resistance.

 
 
 

Vladimir Jojic, Ph.D.
Stanford University
A model of differential regulation in immune cell development and disease progression

Abstract: I will describe a novel model aimed at uncovering differential regulation in related cell types. This model builds on a body of work in module network reconstruction and extends it by permitting within-module regulatory program variations. Changes in regulatory programs underlying differentiation, development, and disease progression are examples of such variation.
The model has three distinct advantages over other regulatory program recovery approaches. First, by leveraging the module framework it allows for sharing of regulatory programs between genes and allows fitting independent programs for each of the modules. Second, it permits regulatory program changes at any point in development, while at the same time promoting regulatory program conservation in related cell types. Third, module-level regulatory program recovery is a convex problem and hence concerns about local minima are eliminated.
I will demonstrate the utility of the new model by analyzing ImmGen compendium data. This data consists of gene expression measurements from 214 distinct cell types of the mouse immune system. In this analysis, the model leverages a differentiation tree to encourage conservation of the regulatory programs between parent and daughter cells. Thus regulatory programs preserved throughout a particular lineage are recovered, in addition to deviations specific to sublineages.

 
 
 

Scott Roy, Ph.D.
Department of Biology, Stanford University
Towards an Integrative View of Eukaryotic Gene Structures

Abstract: I will present a comparative genomic perspective on the evolution and function of eukaryotic gene structures. Nuclear genes are interrupted by spliceosomal introns, quasirandom sequences that are removed from RNA transcripts. Spliceosomal introns are ubiquitous features of eukaryotic genomes: the average human gene contains nearly ten introns, and 95% of multi-exon human genes are alternatively spliced. We used a multidisciplinary approach to study the spliceosomal system at several levels. First, we reconstructed the evolution of intron sequences, and of intron loss and gain, through eukaryotic history. These studies showed that early eukaryotic ancestors had complex intron-exon structures, and that eukaryotic evolution has been characterized by recurrent simplification of gene structures. Second, to understand the locus-specific effects of splicing on gene function, we performed many-species studies of individual animal gene families.
These studies revealed strikingly parallel evolution of intron-exon structures, and elucidated the very different roles for intron-exon structure at different loci. Third, to understand the evolution of splicing regulatory networks, we studied the vertebrate CNS-specific splicing regulator Nova. We used a multidisciplinary approach to reconstruct the step-by-step evolutionary assembly of the Nova splicing network through animal history, revealing large amounts of change, and remarkable plasticity of Nova function. Finally, I will discuss future research plans, including population genomics of gene structure in novel model organisms, genome-wide scans for deeply conserved intron functions in animals, and probing the structure and evolution of splicing networks. In total, these studies elucidate the ‘rules’ of intron-exon structures and evolution, providing a foundation for an integrative view of eukaryotic gene structures.

 
 
 

Roman Yukilevich, Ph.D.
Department of Ecology and Evolution, The University of Chicago
The role of complex genetic architecture in the study of adaptation and speciation

Abstract: Deciphering the genetic and evolutionary mechanisms of complex traits in nature is fundamental to evolutionary theory and biomedical research. In this talk, I will first discuss how recent technological and experimental approaches are challenging our previous assumptions about the genetic basis of adaptation and speciation. Increasingly we are realizing that most traits relevant to medicine and evolutionary biology have a complex and polygenic basis and that studying these phenomena in nature requires a new synthetic approach that combines theory, genomics and phenotypic assays. I will then present some of my recent work on incorporating multi-locus epistatic genetic networks into the theory of adaptation and speciation, and discuss what we are now learning about the genetics of rapidly evolving complex behaviors responsible for speciation in nature.

 
 
 

Jian Lu, Ph.D.
Molecular Biology & Genetics, Cornell University
Function and evolution of small RNA targeting

Abstract: MicroRNAs are small non-coding RNAs that regulate expression of target genes at the post-transcriptional level. A microRNA can target more than 100 genes and one gene can in turn be regulated by multiple microRNAs. Given the broad interactions between microRNAs and their targets, the origin and evolution of such a system is intriguing.
In my seminar, I will first illustrate the birth and death process of microRNAs by showing that most newly emerged microRNAs are evolutionarily transient. Next I will discuss the impacts of microRNA targeting on natural variation in human gene expression. Then I will talk about the relation between aberrant regulation of microRNAs and cancer. Finally, I will briefly describe my research on population genomics of piwi-interacting RNAs (piRNAs), another class of small RNAs.

 
 
 

Sagi Shapira, Ph.D.
Massachusetts General Hospital and Broad Institute of MIT and Harvard University
A physical and regulatory map of host-influenza interactions reveals new critical pathways in influenza infection

Abstract: The dependence of influenza on cellular processes - such as endocytosis, transcription and capping, nuclear import and export, protein translation and secretion - indicates that many indirect and some direct host factors are required for successful replication of the virus. Host factors are also able to detect viruses and activate anti-viral host defenses that limit viral replication. In an effort to characterize host and viral factors that directly or indirectly participate in these processes, we initiated and developed a research program that includes: (i) the identification of a human protein network that physically interacts with the ten influenza proteins; (ii) decomposition of the host cellular transcriptional responses to viral infection; (iii) functional assays to determine the role of cellular and viral factors in mediating immune responses to infection or controlling viral replication; and (iv) a close collaboration with computational biologists to analyze and integrate the data. Together, these approaches led to the identification of several hundred factors that protect cells from influenza or represent human susceptibility factors to influenza virus, and to the construction of a regulatory and functional model of this pathogen-host interaction. Ongoing and future efforts are aimed at developing general strategies for studying dynamic host-pathogen interactions. A mechanistic understanding of these relationships will provide important insights into cellular machinery that controls basic cell biology and will have broad implications in human translational immunology and infectious disease research.

 
 
 

Michael Chevalier, Ph.D.
Postdoctoral Scholar, UCSF
Department of Biochemistry and Biophysics
Quantifying stochastic, spatial and temporal effects in biological signaling pathways

Abstract: Advancements in measurement and imaging techniques now allow one to probe single cells and observe spatial-temporal dynamics of mRNAs and proteins. A need for quantifying these observations has led biology to import analysis tools and researchers from other quantitative disciplines. A given tool, depending on its underlying rules and assumptions, will usually require a certain level of adaptation to analyze biological systems. Inspired by recent dynamic measurements of the unfolded protein response (UPR) in yeast and exocytosis on dendrites, I will discuss some new quantitative biological modeling tools which I developed. They address the multi-scale, stochastic nature of biochemical systems and general 3D spatial-temporal processes in biology such as diffusion on complex membranes. In addition, I will discuss new scientific results. First, through data analysis and modeling of the UPR, I will demonstrate how the chaperone BiP modulates the master signal transducer Ire1. Second, in an effort to explore the condition-based spatial-temporal regulation of membrane receptors on neurons, I will present recent data and 3D spatial-temporal modeling results of exocytic events on neurons. A current controversy in neuroscience is how new signaling receptors are brought onto spines: 1) direct spinal insertion or 2) insertion outside the spine followed by diffusion onto the spine. Through data analysis and modeling I will demonstrate 2) to be a frequent and viable mechanism while also demonstrating that 1) occurs at very low frequencies.

 
 

Seminars: 2010/2011

Sven Sahle, Ph.D.
University of Heidelberg, Germany
Dealing with computational models where most parameter values are unknown - a global optimization approach

Abstract: When developing quantitative models for a systems biology approach, usually some, or even most, of the parameter values are not well known. There are several ways to deal with this problem, including sensitivity analysis and parameter estimation. Here I will present an approach based on global optimization algorithms that allows one to study biologically relevant properties of models even if the knowledge of numerical parameters is very limited.

 
 
 

David J. Lynn, Ph.D.
Teagasc, Irish Agriculture and Food Development Authority
Facilitating Systems-level Analyses of the Innate Immune Response to Human and Animal Pathogens

Abstract: The immune response does not involve simple linear pathways but rather complex inter-connected networks of interactions, regulatory loops and multifaceted transcriptional responses. InnateDB (www.innatedb.com) is one of the first databases and integrated analysis platforms specifically designed to facilitate systems-level analyses of the mammalian immune response and is one of the most comprehensive databases of all human and mouse molecular interactions (115,000+) and pathways (3,000+). Building upon this, more than 14,000 innate immunity-relevant interactions have now been contextually annotated through detailed review of the literature providing novel insight into the innate immunity interactome. Integrated bioinformatics solutions include the ability to investigate user-supplied quantitative data in a network and pathway context using pathway, ontology and transcription factor over-representation analyses, and network visualisation and analysis tools. InnateDB is a core component of our Grand Challenges in Global Health Initiative project to investigate the host response to a range of important human pathogens including Non-Typhoidal Salmonella, Typhoid, Dengue, Malaria, Tuberculosis, HIV and other infections. We are gaining new insight into the common and alternative biological processes, pathways, and regulatory networks, that are involved in the host response to each infection and how host defence peptides modulate these responses. Infectious disease also represents one of the most significant threats to animal health. InnateDB is now being expanded to enable systems biology approaches to animal health genomics. As part of this development, we are implementing a variety of computational methods to infer bovine protein-protein interactions, transcriptional networks and pathways, providing novel insight into the bovine interactome.

 
 
 

Charles F. Stevens, M.D., Ph.D.
The Salk Institute
A Universal Principle Governing the Design of Neural Networks

Abstract: For evolution to work, neural circuits must have a scalable architecture – that is, these circuits must be designed in a way that permits them to process more information by simply increasing the circuit size rather than redesigning it. I will describe one of the design principles that govern such scalable circuits in the vertebrate brain. I start with a consideration of the retina. Retinal ganglion cells (RGCs) sample information about the visual scene and encode this information for transmission to the brain. Because RGCs tile the retina, each RGC constitutes a ‘pixel’. How large should this pixel be? Using a theoretical approach, I give an answer to this question. These same ideas can be extended to give a general design principle (supported by experimental evidence) that applies to circuits throughout the vertebrate brain.

 
 
 

Feng Ding, Ph.D.
University of North Carolina at Chapel Hill
Multiscale modeling of proteins and the applications in protein folding, misfolding, and engineering

Abstract: Molecular modeling of proteins and protein complexes is crucial in our understanding of biology by bridging gaps of time and length scales between experimental observations and underlying protein systems. One of the major challenges in computational and structural biology is to effectively model the molecular system and to efficiently sample the biologically relevant time and length scales. Over the years, we have developed efficient protein conformation and sequence sampling algorithms and multiscale models of proteins. We have successfully applied these methods to study protein folding, protein misfolding and aggregation, protein engineering and protein design. Much of this work was achieved through close collaborations with experimental groups, where computational studies provided experimentally testable hypotheses and experimental results validated these predictions. Driven by applications to biologically important problems, my long-term goal is to develop a multiscale computational modeling framework to model the structure, dynamics, and function of molecules and molecular complexes.

 
 
 

Christopher Snow, Ph.D.
California Institute of Technology
Empirical and Computational Models for Protein Design

Abstract: Nature recombines proteins to create new sequences with desirable properties. We emulate this mechanism in the laboratory to study protein biophysics and to engineer enhanced enzymes. Structure-guided recombination of homologous proteins generates diverse sequences which still have a high probability of retaining the parental fold and function. For example, we have constructed a synthetic family of cytochrome P450 heme domains wherein the average “chimera” differs from the nearest natural “parent” sequence by 72 mutations. As a scientific strategy, recombination yields synthetic protein families that explore what is physically possible versus what is merely biologically relevant. Remarkably, the properties of protein chimeras may be approximated in terms of 1-body contributions from the recombined sequence blocks. Regression analysis results in predictive empirical models for protein stability. We have used such models to engineer synthetic cellulase enzymes with enhanced stability. A key stabilizing mutation was discovered by recombination, highlighting the novel possibilities of multi-scale enzymology. Another common goal for protein design is to alter specificity. For example, we are redesigning the interface between the alpha and beta chains of the human T-cell receptor (TCR) to develop an orthogonal TCR with gene therapy applications. Computational models successfully predicted double mutants with reduced pairing to wild-type TCR. For this task we created a custom design algorithm using SHARPEN, an open-source molecular modeling platform that we have developed to allow facile algorithm design.

 
 
 

Pavel Kraikivski, Ph.D.
University of Connecticut
Models of Cytoskeletal Dynamics: gaining new insights and making testable predictions

Abstract: Filamentous actin and microtubules are two key components of the cytoskeleton that determine cell shape and play an important role in cell motility and intracellular transport. Recent results of modeling analysis of actin dendritic nucleation and effects of microtubule dynamics on intracellular transport will be used to demonstrate how models help analyze raw experimental data, perform “thought” experiments, and even provide guidance for designing wet experiments. First, I will discuss a problem of quantifying a reaction pathway. While progress in uncovering reaction networks makes quantitative identification of protein-interaction pathways a reachable goal, direct measurements of rate constants are not always feasible and parameters are often inferred from multiple pieces of data by means of modeling. Success of this approach relies on sufficiency of available experimental data for unique parameterization of the network. I will demonstrate, in the context of actin dendritic nucleation, how this problem can be analyzed using a concept of rate-limiting step. In the second part of my talk, I will describe a “search-and-capture” model that explains experimental observations of pigment redistribution in Xenopus melanophores, without resorting to parameter fitting. It shows that the capture of granules by the microtubule tips and the subsequent transport along the microtubules by molecular motors are sufficient for accumulating pigment granules at the cell center on experimentally observed timescales. The model is then used to perform thought experiments that help evaluate contributions of individual factors which could not be isolated experimentally.

 
 
 

Daphne Koller, Ph.D.
Stanford University
Understanding gene regulation: Networks and perturbations

Abstract: The cell is a complex network of interconnected entities. The activity of this network is modified by multiple internal and external perturbations. This talk describes computational methods that utilize high-throughput data and statistical models to reconstruct these networks and understand how their activity is modified by various factors. I will show how gene expression data from a population of genetically diverse individuals can be used to uncover genetic mechanisms that cause phenotypic diversity, and how gene expression from diverse immune system cells can be used to uncover the mechanisms that regulate hematopoietic cell differentiation.

 
 

Seminars: 2009/2010

David Botstein, Ph.D.
Princeton University
Coordination of Gene Expression, Metabolism, Cell Division and Growth Rate in Yeast

Timothy Hughes, Ph.D.
University of Toronto
Mapping the protein-nucleic acid interactome

Abstract: One of the major impediments to our understanding of genome function is our incomplete knowledge of how proteins interact with nucleic acid. Current work in my laboratory is aimed at cataloguing descriptions of protein interactions with DNA and RNA in vitro on a genomic scale, in order to enable building models that explain how cells recognize genomic features, how gene regulatory networks are organized, and how genomes evolve.

 
 
 

Arnold Levine, Ph.D.
Institute for Advanced Study
The Forces that Drive the Evolution of Influenza Viruses

Abstract: Influenza A viruses are unusual because they are able to infect their hosts each year, and occasionally virus strains appear that sweep the world in a pandemic, infecting hosts who demonstrate no immunity. These viruses have three properties that permit them to circumvent our immune system. First, they have a high mutation rate, changing their RNA sequences at a frequency of one per 10,000 bases or one mutation per genome. That means every virus produced can differ from its parent. Second, the virus contains eight chromosomes, and mixed infections of viruses in the same host cell provide viruses that have recombined their chromosomes, producing up to 256 different combinations. Third, these viruses replicate in many diverse hosts such as birds, swine, humans, horses, etc. Virus replication in different hosts places very different selection pressures upon the genome sequences of these viruses because the immune system differs between different hosts. Based upon this we have explored whether we could classify a virus as coming from replicating in a bird or a human host simply based upon the nucleotide sequences of its RNA chromosomes. Bird-derived viruses have a sequence distribution with a high U (uracil) content, while bird viruses that have replicated in humans over a ninety-year period progressively lower their U-content. RNA viruses that have replicated in humans for thousands of years have a very low U-content. Examining the di-nucleotide content of bird-derived influenza viruses demonstrates a high content of the di-nucleotide CpG and a lowering of this di-nucleotide content in human-derived viruses by mutation to UpG. In fact all primate RNA viruses have minimized their CpG content, in contrast to RNA viruses that replicate in plants, bacteria or lower organisms without a robust innate immune system.
This selection pressure against CpG content in RNA viruses comes about because this RNA di-nucleotide is recognized by the Toll-7 cellular receptor, which activates the transcription of a large number of genes in the innate immune system of a cell. This produces high levels of interferon, which inhibits the replication of the virus. The influenza viruses that accumulate a higher distribution of UpG therefore replicate better in human hosts, and this is selected for by many rounds of virus replication. The innate immune system produces hundreds of proteins that fight infection and in excess produce severe symptoms of distress in the host. High levels of innate immune responses produce a cytokine storm which disables the host. For that reason the genes that make up the innate immune system and are expressed during a virus infection should all have very low CpG contents to avoid the presence of a positive feedback loop. That is indeed the case: the genes of the innate immune system are among the lowest in CpG content in our genome, and that too has been selected for over time. This is why a bird influenza A virus with a high CpG content entering the human population can have such a severe set of symptoms and a high death rate. The swine influenza A virus that is presently the pandemic strain circling the earth has a moderate CpG content as dictated by its origins from several viruses mixing in pigs and its replication in pigs for several years.

 
 
 

John Pepper, Ph.D.
The University of Arizona
Multilevel Selection and the First General Theory of Cancer

Abstract: Multilevel selection theory applies to many important problems in biology, including applications outside the traditional boundaries of evolutionary biology. My colleagues and I have applied multilevel selection theory to developing and testing the first general theory for cancer biology. The key characteristics of cancer are shaped by two distinct but interacting levels of selection: A history of selection among individuals has shaped human defenses against, and vulnerabilities to, cancer. On a smaller scale, within each body and each life span, somatic cells also meet the conditions for evolution by Darwinian selection: cell reproduction with heritable variation that affects cell survival and replication. Consequently, somatic selection among cells favors the dismantling of normal genetic constraints on cell proliferation and survival. This eventually results in uncontrolled cell proliferation followed by malignant tissue invasion. By illuminating its underlying causal dynamics, evolutionary theory provides a general framework for understanding many aspects of cancer. Several conceptual and analytical tools from evolutionary biology can be applied directly to cancer biology, including somatic phylogenetic reconstruction of cancer cells, and the analysis of cellular adaptation and convergent evolution. This theory has important implications both for cancer research and for cancer medicine.

 
 
 

Daniel Promislow, Ph.D.
The University of Georgia
A Network Perspective on the Evolution of Aging

Abstract: We have long recognized that the strength of selection declines with age. This notion lies at the heart of classic evolutionary genetic theories of aging. However, in the last few years, theoretical and molecular genetic studies have led researchers to question the usefulness of this theory. Recent work in systems biology, and in particular, work on the structure and function of biological networks, suggests a way forward that can integrate evolutionary and molecular approaches in the study of aging. Here I will suggest that a network approach to the study of aging can provide valuable empirical insight into the mechanisms that slow or accelerate rates of aging. Furthermore, such an approach might also serve as the framework for a new and more predictive model, built upon but extending existing theory.

 
 

Recruitment Seminars: 2010

Eric Batchelor, Ph.D.
Harvard University
The ups and downs of p53: Analysis of p53 dynamics in response to DNA damage

Abstract: The tumor suppressor protein p53 is a transcription factor that plays a major role in maintaining genomic integrity by regulating cellular stress responses. p53 is activated in response to several different kinds of stress, including gamma radiation, which causes DNA double strand breaks, and UV radiation, which leads to exposure of single-stranded DNA and stalled replication forks. Previous work in the Lahav lab showed that p53 levels undergo uniform pulses in response to gamma radiation. The amplitude and duration of individual pulses are constant, but the number of pulses seen in individual cells varies from cell to cell. In this talk, I will discuss my work in identifying the mechanism that regulates p53 pulses; namely, that the pulses are the result of two negative feedbacks mediated by the E3 ubiquitin ligase Mdm2 and the phosphatase Wip1. I will also discuss recent single-cell work showing that p53 can undergo dynamics that are distinct from uniform pulses when cells are challenged by UV radiation. These results suggest that p53 dynamics may encode information about the nature of cellular stresses, and may indicate a previously under-appreciated role for p53 dynamics in the regulation of cell fate.

 
 
 

Jeffrey Chang, Ph.D.
Duke University
Genomic Strategies to Decipher the Complexity of Cancer

Abstract: The extreme heterogeneity seen in the rate of progression of cancer and its responsiveness to treatment is reflected in the complexity of the underlying biology. A model is now well established where genetic lesions disrupt the normal function of signaling pathways, and diverse combinations of those aberrant activities lead to tumorigenesis. However, the canonical paradigm of signaling as compartmentalized sequential pathways has thus far not been able to explain the heterogeneity seen in the cancer phenotype. A wealth of data reveals significant crosstalk among pathways, forming structures that are better described as networks. To acquire a deeper understanding of this network structure, I developed an approach to deconstruct pathways into "modules" represented by gene expression signatures. I confirm that they represent units of underlying biological activity linked to known biochemical pathway structures. Importantly, I show that these signaling modules provide tools to dissect the complexity of oncogenic states that define disease outcomes as well as response to pathway-specific therapeutics. Finally, I will discuss future plans to leverage these modules to deconstruct the entire signaling network into functional units, forming the basis of an assay that can comprehensively and simultaneously measure functional activity across the network. I propose that this model of pathway structure constitutes a framework to study the processes by which information propagates through cellular networks, and to elucidate the relationships between molecular activities and clinical phenotypes.

 
 
 

Benjamin Greenbaum, Ph.D.
Institute for Advanced Study
Using Influenza to Probe Innate Immunity

Abstract: Influenza virus infects multiple species and has a high evolutionary rate. This gives the virus an opportunity to encode host specific patterns within its genome. In this talk I will discuss how this feature can be used to explore the evolution of the virus and its host interactions. We show where the recent pandemic likely evolved during periods where no sequences were found. We also found "non-self" sequences the virus avoids in mammals. These sequences were tested and found to have direct consequences for innate immune signaling, along with allowing us to classify the innate immune system in a new way.

 
 
 

Ruth Hershberg, Ph.D.
Stanford University
Disentangling the deterministic from the stochastic in evolution: Insights from the study of bacterial pathogens

Abstract: Evolution is driven by a combination of deterministic and stochastic processes: Mutation, which drives evolution forward by generating variability, is a random process that nevertheless occurs according to certain deterministic biases. Once mutations occur, their fates are influenced by stochastic processes such as genetic drift. At the same time, mutations are also filtered in a deterministic manner by selection based on their fitness effects. Biases in the outcomes of the evolutionary process are the result of a complex combination of mutational biases and selection. Organisms with reduced effective population sizes evolve under selection that is severely inefficient relative to stochastic forces and provide a unique opportunity to study effects of mutational biases and natural selection in isolation. In this talk I will demonstrate that even though bacteria are generally assumed to have extremely large population sizes and are thus expected to be subject to very efficient natural selection, several important bacterial pathogens, such as Mycobacterium tuberculosis, Salmonella typhi, Bacillus anthracis and Yersinia pestis, are in fact evolving under extremely inefficient selection. I will show that this leads to phenotypic consequences, such as an accumulation of functional point mutations across their genomes and accelerated gene loss, that are potentially important from both the evolutionary and medical perspectives. I will then describe how we used sequence data of such pathogens to probe mutational biases and demonstrate that mutation is universally biased towards AT in bacteria. This finding contradicts the long-held view that mutational biases are highly variable among bacteria and suggests that nucleotide content may be a selected trait.

 
 
 

Daniel Larson, Ph.D.
Albert Einstein College of Medicine
Systems Biology in Single Cells: Measurement and Modulation of Transcription Dynamics in vivo

Abstract: Expression levels of individual genes are determined by the balance between production and degradation of RNA. The synthesis of RNA is mediated by a single class of enzymes, the RNA polymerases, which are highly regulated but also subject to stochastic fluctuations. In this talk, I will describe quantitative models of gene expression in humans and yeast which can be deduced by observing the activity of RNA polymerase II on single genes. This approach utilizes a battery of computational and experimental biophysical techniques – including single molecule imaging, numerical modeling, fluctuation analysis, photomanipulation, and bioinformatics – to elucidate how RNA polymerase is regulated throughout the kinetic process of transcription (from promoter clearance to elongation to termination) and how the precision of this control is ultimately limited by intrinsic molecular fluctuations. In yeast, we present a model where RNA production is determined by the search time of limiting factors in the nucleus, modulated by the static positioning of nucleosomes. Some genes, such as the cell-cycle gene POL1, demonstrate diffusion-limited production rates, while other genes such as MDN1 are well below this limit. For a subset of genes, this model explains both the mean and variation for every step of the central dogma: from DNA to RNA to protein. In human carcinoma cells, we developed a gene which can be precisely controlled with light activation (gene “uncaging”), and we demonstrate how dynamics at the promoter can influence expression steps far downstream of promoter clearance. For all these studies, we are able to extract quantitative information on RNA polymerase kinetics using a novel method of fluctuation analysis for observing the low signal-to-noise events inherent in single molecule imaging. The ultimate goal of these studies is to develop both a theoretical and experimental arsenal for understanding and manipulating gene expression at the cellular level.

 
 
 

Bernardo Lemos, Ph.D.
Harvard University
Manifold determinants of gene regulation

Abstract: A key challenge remains to learn how attributes such as expression level, protein function, tissue-specificity, genome organization, aspects of cis and trans regulation, and a multitude of other systems level and gene level attributes interact. Another complementary challenge is to learn how mutation, genetic drift, and natural selection shape variation in these attributes and, ultimately, patterns of regulatory variation between individuals, populations, and species. In this context, the role of heterochromatin and the functional consequences of variation therein have mostly been neglected. We have recently discovered that natural polymorphic variation within highly heterochromatic Y-chromosomes results in the modulation of gene expression at many loci located in the autosomes and at the X-chromosome. Here we will further develop the notion that polymorphism in the lengths and kinds of heterochromatic sequences contributes to global chromatin regulation, with functional consequences to both medically and ecologically relevant phenotypes. These will be tied up with parallel studies aimed at understanding modes of gene expression inheritance, and the contribution of cis and trans regulation to gene expression diversity.

 
 
 

Mihaela Pavlicev, Ph.D.
University of Oslo, Norway; Washington University, St. Louis
Variation in pleiotropy: evolving complex organisms

Abstract: One of the most intriguing questions in evolutionary biology is how complex organisms evolve, if the raw material is generated by blind mutations. A powerful hint lies in the observation that heritable variation in a population is non-randomly distributed. Thus the variation introduced by mutation, which is random with respect to selective needs, maps to the phenotype to produce patterned variation. One of the main mechanisms structuring the genotype-phenotype mapping (GP map) is pleiotropy, where genes affect multiple traits and cause their correlation. For example, limb bones in mammals are highly covarying structures and, when selected upon individually, manifest a correlated response. Correlations are present at any level of organization, and can manifest themselves as side-effects of treatments, or multiple-drug resistance. The lack of pleiotropy, on the other hand, enables variational autonomy and allows parts to evolve and differentiate quasi-independently. The existence of structured variation raises the same question at the next level: how can the structure of the genotype-phenotype map itself evolve?
In my talk I will address this question with a focus on pleiotropy. In the first part I will show that we can detect variation in pleiotropy by a QTL-mapping approach. Variation in pleiotropy is indicative of its evolution. I will present several well-documented examples of variation in pleiotropy. In the final part of the talk I will turn to the question of whether this kind of variation can be used by selection. I will suggest a theoretical model of selection on variation in pleiotropy, resulting in a change in the association of traits, aligning the variation with the direction of selection.

 
 
 

Jon Wilkins, Ph.D.
Santa Fe Institute
Genomic imprinting and conflict-induced decanalization

Abstract: Genomic imprinting, where an allele’s expression pattern depends on its parental origin, is associated with asymmetric effects of natural selection on maternally and paternally derived alleles. Pairs of oppositely imprinted genes (one maternally expressed and one paternally expressed) affecting the same phenotype are expected to engage in an arms race with respect to gene expression level. Increased expression is strongly correlated with increased expression variance, which can translate into increased phenotypic variance. Thus, antagonistic coevolution of imprinted genes can lead to phenotypic decanalization. Analysis of a simple model makes quantitative predictions about the degree of decanalization, which is a function of the magnitude of the evolutionary conflict between maternally and paternally expressed genes, and the relationship between mean gene expression level and expression variance. Canalization modifiers, which reduce the variance in gene expression for a particular mean, are selectively favored, and reduce phenotypic variance in the short term. However, fixation of these modifiers leads to further escalation, with a net effect of increasing the decanalization of the phenotype. Implications of the model are discussed both in terms of growth effects and in terms of elevated frequencies of certain major psychiatric disorders.

 
 

Seminars: 2008/2009

Charles Boone, Ph.D.
Professor, University of Toronto
Banting and Best Department of Medical Research
Global Mapping of Genetic and Chemical-Genetic Networks in Yeast
 

Abstract: Synthetic Genetic Array (SGA) analysis automates yeast genetics, enabling a number of different large-scale/systematic studies. We are attempting to generate the complete synthetic genetic interaction map for yeast. This map groups genes into functional modules and identifies pathways that work together to control essential processes, providing a wiring diagram of the cell. Because a gene deletion mutation provides a model for the effect of a target-specific inhibitor, the genetic network also provides a key for interpreting chemical-genetic interaction profiles, linking bioactive compounds to their cellular targets.

 
 
 

Andrea Califano, Ph.D.
Professor, Columbia University
Department of Biomedical Informatics
A Systems Biology Analysis of Master Regulators of Physiological and Oncogenic Processes in Human B cells

Abstract: Physiological and pathological phenotypes are associated with distinct molecular signatures (e.g. gene expression profiles). While substantial research has been devoted to the analysis of these signatures, very little is known about the key regulators that implement them in the cell. By compiling comprehensive, biochemically validated maps of molecular interactions in human B cells, it is for the first time possible to investigate the mechanisms regulating physiological and oncogenic processes in a completely unbiased, genome-wide fashion. This is helping elucidate the master regulator genes associated with the presentation of normal and pathologic phenotypes and the mechanism of action of individual drugs in distinct cellular contexts.

Bio: Dr. Andrea Califano received a Ph.D. in Physics on the study of chaotic behavior in high-dimensional dynamical systems from the University of Florence, Italy, in 1985. In 1990 Dr. Califano started the IBM research initiative in Computational Biology, which culminated in the creation of the IBM Computational Biology Center in 1997, a worldwide organization that he directed until his departure. In 2003, he was appointed Professor of Biomedical Informatics at Columbia University, where he is currently the lead PI of the Center for the Multiscale Analysis of Genetic Networks (MAGNet), Associate Director for Bioinformatics of the Herbert Irving Comprehensive Cancer Center (HICCC), and co-Director of the Center for Computational Biology and Bioinformatics (C2B2). Since 1998 he has been especially active in the development of integrative methodologies for the dissection of cancer phenotypes. His lab has pioneered a wide range of methodologies for the reverse engineering and biochemical validation of genome-wide gene regulatory networks in human cells - including transcriptional, post-transcriptional and post-translational interactions - and their use for the dissection of physiologic and pathologic phenotypes.

 
 
 

Leah Elizabeth Cowen, Ph.D.
Canada Research Chair in Microbial Genomics & Infectious Disease
Assistant Professor, University of Toronto
Department of Molecular Genetics
Cellular Stress, Signaling, and Evolution: Hsp90's Role in Fungal Drug Resistance

Abstract: The emergence of drug resistance in pathogenic microbes provides a poignant example of microbial evolution with profound consequences for human health. The widespread use of antimicrobial drugs in medicine and agriculture exerts strong selection for the evolution of drug resistance. Selection acts on the phenotypic consequences of resistance mutations, which are influenced by the genetic variation accrued in particular genomes. Here I discuss recent studies that establish a mechanism by which the molecular chaperone Hsp90 can alter the relationship between genotype and phenotype in an environmentally contingent manner, influencing the course of evolution. Hsp90’s role in fungal drug resistance is to enable specific cellular signaling pathways and crucial responses to the cellular stress exerted by antifungal drugs. Harnessing Hsp90 holds great promise for treating life-threatening infectious diseases.

Bio: Leah Cowen pursued her doctoral research with Jim Anderson and Linda Kohn at the University of Toronto focused on the genomic architecture of adaptation to antifungal drugs. As a postdoctoral fellow with Susan Lindquist at the Whitehead Institute, she then investigated how the molecular chaperone Hsp90 impacts on fungal evolution and phenotypic diversity. Since 2007, Leah has been a Canada Research Chair in Microbial Genomics and Infectious Disease in the Department of Molecular Genetics at the University of Toronto. Her laboratory focuses on the molecular mechanisms by which alterations in cellular signaling and stress responses impact on fungal evolution, drug resistance, and pathogenesis and further explores how these mechanisms can be harnessed for treating fungal disease.

 
 
 

Pamela Silver, Ph.D.
Professor, Harvard Medical School
Department of Systems Biology
Designing Biological Systems

Abstract: Biology presents us with an array of design principles. From studies of both simple and more complex systems, we understand at least some of the fundamentals of how Nature works. We are interested in using the foundations of biology to engineer cells in a logical way to perform certain functions. In doing so, we learn more about the fundamentals of biological design as well as engineer useful devices with a myriad of applications. For example, we are interested in building cells that can perform specific tasks, such as counting mitotic divisions, measuring life span and remembering past events. Moreover, we design and construct proteins and cells with predictable biological properties that not only teach us about biology but also serve as potential therapeutics, cell-based sensors and factories for generating bioenergy. In doing so, we have made new findings about how cells interact with the environment.

 
 
 

Dana Pe'er, Ph.D.
Assistant Professor, Columbia University
Department of Biological Sciences
Driving Mutations: Lessons from Yeast and Cancer

Abstract: We will discuss methods that harness gene expression to identify genetic variants that influence a trait of interest. Our premise is that much of the influence of genotype on phenotype is mediated by changes in the regulatory network and these can be inferred using gene expression. We will demonstrate two such methods. The first, Camelot, is an algorithm that integrates genotype and gene expression collected in a reference condition (un-drugged) and phenotype data to predict complex quantitative phenotypes in entirely different conditions (drug response) and identify causal genes that influence these traits. We systematically applied our algorithm to a collection of yeast segregants to predict the response to 87/94 drugs and experimentally confirmed 22/24 gene-drug interactions. Our second method, Conexic, is a novel Bayesian Network-based framework to integrate chromosomal copy number and gene expression data to detect genetic alterations in tumors that drive proliferation, and to model how these alterations perturb normal cell growth/survival. The underlying assumption of our approach is that significantly recurring copy number change, coinciding with its ability to predict the expression patterns varying across tumors, strengthens the evidence of a gene’s causative role in cancer. We applied Conexic to a melanoma dataset comprising 62 tumor samples and correctly identified most known ‘driver’ events, while also connecting these to their known targets (e.g. MITF). In addition, our analysis suggests a number of novel drivers, including a number of genes involved in regulation of protein trafficking and endosome biology in this malignancy. Preliminary experimental validation supports several of these findings.

 
 
 

Charles Peskin, Ph.D.
Professor, Courant Institute of Mathematical Sciences, New York University
Department of Mathematics & Neural Science
A look-ahead model for the transcriptional dynamics of RNA polymerase

Abstract: The look-ahead model for the transcriptional dynamics of RNA polymerase postulates that there is a window of activity (a subset of the transcription bubble formed by RNA polymerase as it separates the two strands of DNA) within which ribonucleoside triphosphates (rNTP) may be reversibly bound to the template strand of the DNA before being covalently linked to the nascent RNA chain. If the window of activity is large enough to accommodate more than one base pair, then a kind of parallel processing is possible, in which the different rNTP can be selected in advance and held in readiness until they are needed, thus enabling faster synthesis of RNA. In this talk, we derive the statistical distribution of the waiting times between forward moves of the RNA polymerase molecule, and compare these theoretical results to experimental data. The implications of look-ahead for the error rate of transcription will also be discussed.

Bio: Charles S. Peskin was born on April 15, 1946, in New York City. He studied Engineering at Harvard (A.B., 1968), and Physiology at the Albert Einstein College of Medicine (Ph.D., 1972). In 1973, he joined the Courant Institute of Mathematical Sciences, New York University, where he is now a Silver Professor, Professor of Mathematics, and Professor of Neural Science. He is also currently an A.D. White Professor-at-Large of Cornell University. Peskin's honors include a MacArthur Fellowship (1983-1988), the Mayor's Award for Excellence in Science and Technology (NYC, 1994), and the George David Birkhoff Prize in Applied Mathematics (AMS/SIAM, 2003). He is a Fellow of the American Institute for Medical and Biological Engineering (1992), of the American Academy of Arts and Sciences (1994), and of the New York Academy of Sciences (1998); and a Member of the National Academy of Sciences (1995) and of the Institute of Medicine (2000). Peskin's field of research is the application of mathematics and computing to medicine and biology, especially in the areas of heart physiology, neural science, and biomolecular motors. He is especially known for the immersed boundary method, a general computational framework for problems of fluid-structure interaction, like that posed by the mechanics of the human heart.

 
 
 

Saeed Tavazoie, Ph.D.
Associate Professor, Princeton University
Department of Molecular Biology
Predictive behavior within microbial genetic networks

Abstract: Through a combination of physiological observations, in silico simulations, and laboratory experimental evolution, we provide evidence that intracellular regulatory networks are capable of predictive behavior in a fashion similar to metazoan nervous systems. Our observations challenge the dominant homeostatic framework and reveal 'psychological' constraints on the evolution of intracellular regulatory networks.

 
 
 

John Tyson, Ph.D.
University Distinguished Professor, Virginia Tech
Department of Biological Sciences
Temporal Organization of the Cell Cycle

Abstract: The coordination of growth, DNA replication and division in proliferating cells can be adequately explained by a 'clock + checkpoint' model. The clock is an underlying cyclical sequence of states; the checkpoints ensure that the cycle proceeds without mistakes. From the molecular complexities of the control system in modern eukaryotes, we isolate a simple network of positive and negative feedbacks that embodies a clock + checkpoints. The model accounts for the fundamental physiological properties of mitotic cell divisions, evokes a new view of the meiotic program, and suggests how the control system may have evolved in the first place.

Bio: John Tyson is a University Distinguished Professor in the Department of Biological Sciences at Virginia Tech. His research interests are network dynamics and cell physiology, especially regulation of the cell division cycle. He received his Ph.D. in chemical physics from the University of Chicago. He is a past president of the Society for Mathematical Biology and a former co-chief editor of the Journal of Theoretical Biology. Honors include the Bellman Prize in Mathematical Biosciences, the Aisenstadt Chair in Mathematics, Research Fellow at Merton College (Oxford), and Virginia Scientist of the Year.

 
 
 

Gunter P. Wagner, Ph.D.
Chairman, Yale University
Department of Ecology and Evolutionary Biology
From Mechanisms of Single Gene Regulation to the Evolution of Gene Regulatory Networks

Abstract: The evolution of organismal complexity is, in large part, based on the evolution of differential gene regulation. Hence the population genetic and molecular processes that lead to the origin of novel expression patterns are the core of evolutionary developmental biology. We investigate these processes in the context of the evolution of placentation in the mammals, specifically the evolution of the endometrial stromal cell (ESC). The ESC is the maternal cell type that forms the maternal-fetal interface. In short, we find that there are two types of unconventional genetic changes involved in the evolution of the ESC gene regulatory network: 1) novel cis-regulatory elements, and 2) adaptive changes in the functional properties of transcription factor proteins. These findings contradict the current paradigm of developmental evolution, which assumes a conserved developmental "toolset" (i.e. the protein functions do not change) and that cis-regulation evolves through the modification of ancestral CREs. It is thus likely that innovation and adaptation may be based on different kinds of genetic changes. Innovations are facilitated by transposable elements and induce changes in the transcription factor proteins, while most adaptive modifications of development are based on modifications of existing CREs.

 
 

Recruitment Seminars: 2008/2009

Kevin Chen, Ph.D.
New York University
Department of Biology
Macro- and micro-evolution of gene regulation mediated by microRNAs

Abstract: Studying the evolution of gene regulation is important because (1) regulatory mutations can cause disease, (2) understanding cis-element evolution will help us design algorithms for predicting these elements, and (3) it is important for understanding phenotypic evolution. I will discuss the evolution of animal microRNAs and their binding sites at two different time scales. MicroRNAs are small RNAs that post-transcriptionally regulate their target mRNAs and have been implicated in many biological processes, including cancer and viral defense. At the macro-evolutionary time scale, we show that microRNA genes are well conserved but their targets have diverged rapidly. At the micro-evolutionary time scale, we use human SNP genotype data to demonstrate selective constraint on microRNA sites, implying that polymorphisms in these sites are candidates for causal disease variants. We also use our approach to identify a set of non-conserved microRNA sites in genes co-expressed with the microRNA.

 
 
 

Sen Cheng, Ph.D.
Professor, University of California
The Sloan-Swartz Center for Theoretical Neurobiology
Memory Formation and Storage in the Hippocampus

Abstract: To form memories about events, such as our last birthday party, we need a brain region called the hippocampus. This structure consists of anatomically and functionally distinct subregions: dentate gyrus (DG), CA3, and CA1. A consensus is emerging that episodic memories are initially stored in the pattern of synaptic connection strengths between hippocampal neurons. However, the sites and mechanisms of this plasticity remain uncertain. In the first part of my talk, I will present a quantitative model to compare different memory storage strategies. I will show that for the rat hippocampus, memory storage in the DG yields the highest storage capacity, especially when taking into account the birth and death of neurons that occur only in the DG. In the second part, I will discuss experimental evidence for how hippocampal activity changes when animals learn about a novel environment. Using a novel computational analysis method, called adaptive filtering, I find that neurons' spatio-temporal coordination quickly improves towards the level found in familiar environments. Intriguingly, this improvement might be driven by coordinated neural activity in a distinct state of the network.

 
 
 

Eric Deeds, Ph.D.
Harvard Medical School
Department of Systems Biology
Dynamic individuality in protein-protein interaction networks

Abstract: Protein-protein interactions play a crucial role in all cellular processes, from the regulation of gene expression to the transduction and processing of extracellular signals. Over the past decade, high-throughput techniques such as Yeast 2-Hybrid (Y2H) and Tandem Affinity Purification (TAP-tagging) have provided a global picture of what the entire protein-protein interaction (PPI) network in certain organisms might look like. While these methods are often quite noisy (with potentially high rates of false positives and false negatives), they have nonetheless served as the substrate for a large body of work aimed at characterizing or explaining the general topological structure of these networks. Such purely topological studies are limited, however, by the fact that they consider a static description of an inherently dynamical system. A full characterization and understanding of the behavior of PPI networks clearly requires that one be able to describe and understand the dynamics of hundreds to thousands of objects physically interacting with one another. In this work we employ recently developed rule-based modeling techniques to perform the first large-scale stochastic simulations of the PPI network found in the cytoplasm of yeast cells. These simulations reveal that cells prepared in identical initial conditions will, at steady state, differ considerably from one another in terms of the identities of the large protein complexes found in each. Our results indicate that such dynamic individuality may arise in many complex interaction and signaling networks.

 
 
 

Olivier Elemento, Ph.D.
Weill Medical College of Cornell University
Decoding the regulatory genome and proteome

Abstract: Deciphering the non-coding regulatory genome has proved a formidable challenge. Despite the wealth of available gene expression data, there currently exists no broadly applicable method for characterizing the regulatory elements that shape the rich underlying dynamics. I will present a general framework for detecting such regulatory DNA and RNA motifs that relies on directly assessing the mutual information between sequence and gene expression measurements. Our approach makes minimal assumptions about the background sequence model and the mechanisms by which elements affect gene expression. This provides a versatile motif discovery framework that achieves, across all data, exceptional sensitivity and near-zero false-positive rates. Applications from yeast to human uncover novel putative and established transcription-factor binding and miRNA target sites, revealing rich diversity in their spatial configurations, pervasive co-occurrences of DNA and RNA motifs, context-dependent selection for motif avoidance, and the strong impact of post-transcriptional processes on eukaryotic transcriptomes. I will present an extension of this approach to discovering the protein motifs that underlie protein behavior, e.g., localization, expression, half-life, interactions, etc. The systematic application of this new approach to eukaryotic protein behavior profiles reveals known protein regulatory motifs (post-translational modification sites, localization signals, etc.) and many novel ones. It also reveals widespread motif co-occurrences (suggesting combinatorial regulation at the post-translational level) and non-random spatial distribution of certain motifs in the primary sequence. These information-based approaches represent a major contribution to the ongoing effort to systematically characterize eukaryotic regulatory elements and understand their role in complex processes such as disease development.

 
 
 

Uri Hershberg, Ph.D.
Professor, Yale School of Medicine
Department of Pathology
What Germline and Mutated V Genes Tell us about Selection

Abstract: The adaptive immune system functions through the use of its variable repertoires, of both B cells and T cells. To understand its function we must understand how these repertoires change and adapt throughout a single immune reaction and throughout a lifetime. In this talk I would like to focus on the B cell repertoire. This repertoire develops, first, by its germline diversity and, second, through mutation and selection following specific immune reactions and affinity maturation. Knowledge of how the B cell repertoire develops in health can also inform us as to its development in disease, showing us to what extent selection of specific repertoires affects autoimmunity and the development of B cell lymphomas.
Germline diversity influences the diversity of B cells following mutation. The B cells are unique amongst mammalian cells in that they have evolved to function under high rates of mutation and to generate viable and varied mutants. We compared the codon usage of different complementarity-determining (CDR) and framework (FW) regions of germline V genes both to each other and to non-mutation control regions. Surprisingly, we have found that while all V genes have evolved to generate variable progeny under high rates of mutation, L and I light chain V gene families differ in the extent to which they risk their potential viability and in the extent of their diversity upon mutation. Furthermore, it appears that the general traits of amino acids have an impact on determining selection.
Working on the basis of these discoveries, we have developed a novel statistical test that follows affinity maturation and selection through the patterns of mutation generated by affinity maturation. We validated our new methods by testing them both on synthetic data generated from a stochastic simulation of B cell clonal expansion, and on in vivo microdissected sequences from Ig transgenic mouse models, in which positive selection was expected to be a significant force. Our results indicate that our method overcomes previous problems in differentiating between positive selection in the CDR and negative selection in the FW. We next utilized our method to analyze published sequence data from diffuse large B-cell lymphomas. Surprisingly, we found that previous indications of positive selection there were unfounded, and that this type of cancer was not as varied as previously thought.

 
 
 

Xuhui Huang, Ph.D.
Postdoctoral Scholar, Stanford University
Department of Bioengineering
Computational tools for enhancing conformational sampling of biological macromolecules

Abstract: Conformational changes are crucial for a wide range of biological processes including biomolecular folding and the operation of key cellular machinery. Probing the mechanisms of conformational changes at atomic resolution is difficult experimentally, and computer simulations may complement experiments by providing dynamic information at an atomic level. One of the main challenges for computer simulations is insufficient sampling: interesting conformational changes occur at timescales of at least microseconds, while atomic simulations tend to cover about nanoseconds. Popular enhanced sampling algorithms such as the Replica Exchange Method (REM) and Simulated Tempering (ST) use high temperatures to help systems cross energetic barriers, but their efficiency is limited by entropic barriers. I will introduce the Adaptive Seeding Method (ASM) for studying the thermodynamics of conformational changes of biological macromolecules. By applying the ASM to RNA hairpin folding I will demonstrate that it is significantly more efficient than REM and ST. Only local equilibrium is necessary for ASM, so very short seeding simulations may be used. Markov State Models (MSMs) are then used to extract the global equilibrium populations from these short simulations. Finally, I will show our simulation results for a more complex system: the RNA Pol II transcription complex. The effects of single point mutations and different protonation states of a critical histidine residue in the polymerase active site have been studied. The mutation studies were consistent with experimental observations, while the protonation work allowed us to predict that the histidine must be protonated to stabilize interactions between RNA polymerase and an incoming NTP.

 
 
 

Chen Hou, Ph.D.
Santa Fe Institute
Optimization, Constraint and Control by the Networks - Three case studies in Systems Biology

Abstract: Many research works in systems biology develop bottom-up models for understanding how the higher level behaviors of complex biological systems emerge from interactions among individual components across multiple scales. In this talk, I take a top-down view for understanding why the higher level behaviors of complex biological systems are the way they are, from the angle of evolution, physical design, optimization rules, etc. I will show how the higher level patterns optimize, constrain and control the biological systems in three case studies. The first one is about the fractal-like networks in pulmonary and cardiovascular systems. I will talk about how the networks optimize the systems' adaptation to changing resources and demands. The second one is about energy budget over ontogenetic growth. I will talk about how networks constrain growth and reprogram the metabolism when perturbed. The third one is about the role of blood-glucose control in treating sepsis. I will briefly talk about how the "causes" of the complex disease can be downwardly controlled by the "results" - the symptom of the complex disease.

 
 
 

David G. Miguez, Ph.D.
Postdoctoral Fellow, Harvard University
Systems Biology Department
Bistable response to anti-cancer drugs in the Akt-mTOR pathway

Abstract: The crosstalk between the Akt and the mTOR pathways results in several feedback regulations of the signaling. The combination of these positive and negative feedback loops results in a nonlinear response that influences the outcomes of cancer drug therapies. These nonlinearities often lead to, for instance, switch-like behavior or hysteresis and determine the optimal drug concentration, since its specificity is lost at high concentrations. Using a reversible inhibitor (i.e., it does not form a covalent bond with its target and can be washed out) we perturbed the Akt/mTOR pathways to test for bistability. Two separate sets of cells, one of them pre-treated with the inhibitor, were further treated with several dilutions of the same inhibitor. Measurements of the pathway activity demonstrated the existence of bistable behavior. This "memory" of the pathway has unprecedented characteristics, with the pre-treated cells exhibiting increased resistance to the inhibitor (during the second inhibition, more drug is needed to produce the same anti-cancer effect in the cells). Experiments where the mTOR pathway is also inhibited exhibited no bistability, showing that the ability to "remember" arises from the crosstalk between the Akt and mTOR pathways in the form of feedback loops.

 
 
 

Arjun Raj, Ph.D.
Massachusetts Institute of Technology
Nature, nurture or just dumb luck: gene expression variability and cell fate

Abstract: Gene expression is remarkably imprecise, leading to significant cell-to-cell variability in the numbers of mRNAs and proteins even in genetically identical populations. This surprising finding raises a couple of questions: can cells sometimes exploit this variability for their own benefit? Conversely, do cells reduce the impact of variability in order to produce reliable outcomes in other contexts? We explored the first question by studying the random transitions of the soil bacterium B. subtilis to the "competent" state. However, while unicellular organisms require stochastic variability to exist in multiple states simultaneously, one might expect multicellular organisms (with developmentally controlled cell fates) to have mechanisms designed to buffer this variability. To see if and how organisms do this, we studied the gut formation pathway in embryonic development in C. elegans. We found that the normal gut development pathway is remarkably robust, but this robustness can be destroyed by mutations to a single gene that result in wildly varying embryonic fates. We have shown that these different fates result from the now variable expression of a key upstream regulator in the rewired mutant gut pathway. These results suggest that redundancy in developmental pathways can serve to mask and buffer otherwise hidden sources of gene expression variability.

 
 
 

Orkun Soyer, Ph.D.
Microsoft Research - University of Trento Centre for Computational and Systems Biology (CoSBi)
Studying biological networks: An evolutionary perspective

Abstract: Biological networks are the result of evolution, rather than design. Thus, understanding how evolutionary dynamics affect networks is important for accomplishing a complete understanding of how these systems work. Here, I will present how we can distill key information on network structure and dynamics using dynamical network models and evolutionary simulations. Further, the talk will illustrate how key evolutionary processes such as duplication, species interactions and selection can result in the emergence of certain biological properties such as robustness.

 
 
 

Haiyuan Yu, Ph.D.
Harvard Medical School
Department of Genetics & Department of Cancer Biology at Dana-Farber Cancer Institute
Understanding Large-scale Interactome Networks

Abstract: Proteins function through interactions with bio-molecules, DNA and other proteins especially. The set of all molecular interactions in an organism is its "interactome". More specifically, the interactome is the sum of all protein-protein interactions. Current yeast interactome network maps contain several hundred molecular complexes, with limited and at times controversial representation of direct binary interactions. We carried out a comparative quality assessment of current yeast interactome datasets, finding that the high-throughput Y2H dataset covers ~20% of all yeast binary interactions. Both Y2H and affinity purification followed by mass spectrometry (AP/MS) data are of equally high quality but are fundamentally different and complementary, producing networks with dissimilar topological and biological properties. Compared to co-complex interactome models, the binary Y2H map is enriched for transient signaling interactions and inter-complex connections, and shows a highly significant clustering between essential proteins. Protein connectivity correlates with genetic pleiotropy, not with essentiality. Furthermore, topological analysis of interactome networks can help identify key proteins involved in various human diseases, especially because many diseases result from disrupted interactions among proteins. In sum, mapping and analyzing interactome networks will lead to better understanding of human disease, especially cancer.

 
 
 

Chris Wiggins, Ph.D.
Associate Professor, Columbia University
Department of Applied Physics and Applied Mathematics, and Center for Computational Biology and Bioinformatics
Learning Networks from Biology, Learning Biology from Networks

Abstract: Both the 'reverse engineering' of biological networks (for example, by integrating sequence data and expression data) and the analysis of their underlying design (by revealing the evolutionary mechanisms responsible for the resulting topologies) can be recast as problems in machine learning: learning an accurate prediction function from high-dimensional data. In the case of inferring biological networks, predicting up- or down-regulation of genes allows us to learn ab initio the transcription factor binding sites (or 'motifs') and to generate a predictive model of transcriptional regulation. In the case of inferring evolutionary designs, quantitative, unambiguous model validation can be performed, clarifying which of several possible theoretical models of how biological networks evolve might best (or worst) describe real-world networks. In either case, by taking a machine learning approach, we statistically validate the models both on held-out data and via randomizations of the original dataset to assess statistical significance. By allowing the data to reveal which features are the most important (based on predictive power rather than overabundance relative to an assumed null model) we learn models which are both statistically validated and biologically interpretable.

 
 
 

Andrew Yates, Ph.D.
Professor, Emory University
Department of Biology
The Ecological Dynamics of Immune Systems and Pathogens

Abstract: Immune systems and pathogen populations are dynamic entities, and in this talk I hope to convince you that a richer understanding of both can emerge by combining mathematical models and experimental data. I'll show how close contact between models and data can shed some light on some basic questions in infectious disease and immunology. For example - what governs the dynamics of acute malaria infection? How do T cells sense their environment and regulate their numbers? How fast can a T cell find its target, or in other words how much of a CTL-cell-based immune response does a vaccine need to generate to be effective? Should we worry about 'squeezing out' existing immunological memory in an individual by administering more and more vaccines? Why is progression to AIDS in HIV infection so slow?
The collaborations involved in addressing these questions have required quite different modeling approaches, different levels of detail and complex biology on different scales. What they have in common, though, is that the dynamical models can help us discriminate between alternative mechanisms and can make remarkably accurate predictions of how immune responses and pathogens behave. I'll show that sometimes we can get surprisingly far by neglecting a lot of the detail that we hold dear. However, we often gain the most insight when we can reject our favorite model - and we usually uncover more interesting biology in the process.
