Metabolites are small chemical molecules that play many important roles in living organisms. The field of metabolomics aims to detect, identify and quantify them in order to answer biological questions. These questions can arise in many different fields, such as medicine and quality assessment of foods or environmental areas. This lecture will present some of these applications, but will focus on the challenges in analysing these data, which are obtained by complicated data-processing pipelines and are highly multidimensional in nature. A common question, for instance, is to identify differences between two groups of samples, one control group and one treatment group (maybe corresponding to healthy and diseased, respectively). The overall aim is to understand biology at the level of, e.g., physics, where model-based predictions achieve high accuracies - metabolomics is an essential part of solving this puzzle.

A central topic in plant breeding and genetics is the study of genotype by environment interaction (GxE). GxE occurs when differences in performance (phenotype) between plants with different genetic constitutions (genotypes) are a function of the environmental conditions. Modelling of GxE is relevant for insight into adaptation. Climate change forces plants to adapt to higher temperatures and drought. Plant breeders and geneticists try to identify the genetic factors underlying adaptation. For the modelling of GxE, various classes of statistical models have been proposed. A classic approach describes GxE by a joint regression of a sample of genotypes on one or more environmental predictors, where the GxE is expressed as heterogeneity in the slopes of the genotypic regressions. Assumptions on the character of the environmental predictors determine whether the model is a linear (regression) model or a bilinear model (both slope and predictor are estimated from the same phenotypic data). In recent years, the amount of data that can be measured on growing plants has increased sharply. Variation at the DNA level has become available at very high resolutions. Similarly, phenotypes and environmental conditions can be followed at high temporal and spatial resolutions. Finally, all kinds of omics data (gene expression, proteins, metabolites, methylations) can be generated. Modelling of plant growth and development as a function of DNA variation, omics data and environmental inputs offers interesting statistical challenges. In the presentation, several classical and new approaches to the statistical modelling of GxE and plant growth will be presented.
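
As a rough illustration of the joint regression idea, the sketch below regresses each genotype on an environmental index and reports the per-genotype slopes. It assumes the index is the environment mean minus the grand mean (the choice usually associated with Finlay-Wilkinson regression), so the predictor is estimated from the same phenotypic data. The function name `joint_regression` and the toy data are hypothetical and not taken from the lecture.

```python
from statistics import mean

def joint_regression(y):
    """y[i][j] is the phenotype of genotype i in environment j.

    Returns the environmental index and one regression slope per genotype.
    Unequal slopes across genotypes indicate GxE.
    """
    n_env = len(y[0])
    grand_mean = mean(v for row in y for v in row)
    # Environmental index: environment mean minus grand mean (assumed predictor).
    env_index = [mean(row[j] for row in y) - grand_mean for j in range(n_env)]
    ss_index = sum(e * e for e in env_index)
    slopes = []
    for row in y:
        geno_mean = mean(row)
        # Least-squares slope of this genotype on the environmental index.
        sxy = sum((v - geno_mean) * e for v, e in zip(row, env_index))
        slopes.append(sxy / ss_index)
    return env_index, slopes

# Toy data (hypothetical): three genotypes with different sensitivities
# to the environment, measured in five environments without noise.
env_quality = [-2.0, -1.0, 0.0, 1.0, 2.0]
true_slopes = [0.5, 1.0, 1.5]
y = [[10.0 + b * e for e in env_quality] for b in true_slopes]

env_index, slopes = joint_regression(y)
```

In this noise-free toy example the estimated slopes recover the sensitivities used to generate the data; with real trial data one would add residual error and typically fit the model with a mixed-model package instead.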

Nanotechnology has great prospects of advancing technical capabilities to levels that go far beyond the technology of today. We can learn from Nature that machines at the scale of molecules can, in principle, be produced. Actually assembling useful machines is still a formidable challenge. However, important steps have already been taken, and research requires the integration of many disciplines: physics, chemistry, materials science, biology, information science, and more. I will present a few examples of experiments and techniques that have been developed, although this account will be colored by the perspective of my own research work. This will show that we have learned how to understand and how to control properties of matter down to the level of individual atoms and individual molecules.

Since plants are sessile organisms, they cannot move and have to make do with the environment they find themselves in. As a consequence, the ability to sense and adaptively respond to environmental conditions is of critical importance to the survival of plants. These responses involve many different decisions: in which direction a plant organ (leaf, root) should grow, where new organs should be formed, and when flowers should be formed.

In this lecture, I will illustrate how plants count and memorize signals, keep time, and integrate local and long-distance signals for their decision making. Furthermore, I will illustrate how using a “computational mindset” helps us to better understand the type of computations that plants perform to reach their adaptive decisions.

DNA molecules contain a second layer of information on top of the classical genetic information. This second layer is geometrical/mechanical in nature and guides the folding of DNA molecules inside cells. With the help of a new Monte Carlo technique, Mutation Monte Carlo, and of graph theory we demonstrate that the degeneracy of the genetic code allows for multiplexing of the two information layers. We specifically show that mechanical cues on the DNA molecule can place nucleosomes (DNA-wrapped protein cylinders) with single base-pair resolution anywhere on the genome of baker’s yeast. This suggests that there is plenty of space for other layers of information, e.g. the translation speed in ribosomes - important for the co-translational folding of proteins. We demonstrate that it is indeed possible to design synonymous mutations to reposition nucleosomes on genes under the additional constraint of keeping the translation speed pattern nearly intact. This suggests that DNA might carry (at least) three layers of information on top of each other.

Microbes such as bacteria and yeasts actively optimise their cellular growth rate by tuning concentrations of catalytic enzymes and ribosomes. They are able to switch metabolism to accommodate changes in food substrates, mount stress responses, and even shut down the entire cell and go into growth arrest when the need arises. They do all this without having direct knowledge of changes in the environment such as the availability of food sources. In this talk I will present a general theory of how cells might be able to solve this conundrum, and how they might implement this using biological mechanisms such as gene expression.

Our genome is organized in long strings of nucleosomes, consisting of about 150 base pairs of DNA and 8 histone proteins, which are spaced by 10-80 base pairs of linker DNA. Strings of nucleosomes further fold into dense chromatin fibres. The structure of these fibres has remained largely obscure because both high-resolution structural techniques like NMR, electron microscopy and X-ray crystallography and optical (super-resolution) microscopy techniques can hardly resolve the path of the DNA in such fibres. However, exciting new progress has been reported in the field, and the contours of a physical understanding of chromatin structure and dynamics are now taking shape. It turns out that the distribution of nucleosomes over the DNA is encoded in the sequence of the DNA itself, in a second layer of information on top of the sequence encoding for proteins. Simple statistical physics models can now largely reproduce the experimentally observed distribution of nucleosomes over our genome. At the next level, the higher-order folding of chromatin fibres is governed by nucleosome-nucleosome interactions. Using single-molecule force spectroscopy we were able to unravel these chromatin structures. From such pulling experiments we deduced that the length of the linker DNA, which is defined by the sequence-dependent positions of the nucleosomes, determines whether nucleosomes interact and whether they interact with neighbours or next-neighbours. This stacking of nucleosomes puts tight mechanical constraints on the linker DNA. Monte Carlo simulations suggest that there may even be a sequence dependence of the folding of chromatin fibres. Overall, there is increasing evidence that eukaryotic genomes evolved not only to optimize protein structure and functionality, but also to encode part of the regulation of their genes.
Accordingly, access to the DNA is regulated by variations in nucleosome positions and higher-order folding, which form a second and third layer of information and can be multiplexed with protein encoding.

“The field is continuously fascinating. In fact, I’ve often thought a good title for either a book or a lecture about immunology would be Endless Fascination”

Dr William Erwin Paul (2012)

The immune system of plants and animals is very complex and we are still at the very beginning of understanding this complexity. Every small step we take in this direction leaves us once more fascinated with the efficiency and precision of the system. In my lecture I will first explain what is often seen as the main function of the immune system: discriminating self (harmless) from non-self (dangerous). The immune system has several mechanisms to stay tolerant to healthy cells of the host (self) while mounting very efficient immune responses to foreign pathogens. During an organ transplantation, self/non-self discrimination plays a very important role. Though it is meant to be harmless, a donor organ is seen as a foreign invasion by the host body. I will summarize the computational tools we developed to predict the “foreignness” of a solid organ transplant.

To survive, all living cells must be able to perceive changes in their environment and adapt accordingly. Proteins play an essential role in these processes by reacting to environmental changes and forming networks to pass on signals to other cellular machinery. Sensor proteins change conformation upon receiving a specific trigger. These changes are then propagated and amplified by networks of interacting proteins. Studying the intricate changes in signal transduction networks requires high resolution in both time and space, as highlighted by two examples. Starting at the very beginning of a signal transduction network, i.e. the actual perception, the molecular mechanisms involved in sensing light will be discussed. Further along the signal transduction network, amplifier proteins interact with many different proteins, as will be illustrated by the Ras protein involved in cell growth. The lecture will conclude by discussing how mutant amplifier proteins can disrupt signal transduction and cause tumours.

Compared to, for example, economists, evolutionary biologists are in the lucky situation that they can build on a good microscopic mechanism: individuals reproduce almost, but not totally, faithfully. Define the environment as anything outside an individual that impinges on its behaviour, now or in the future, and a population as a collection of individuals sharing the same environment. In that case we can use the frequency distribution of individuals over their physiological states as the population state. If the number of individuals is sufficiently large, the population process becomes deterministic. If the environment is given as a function of time, the dynamics of the population state is linear, though possibly time varying. In real populations the feedback loop is closed: the environment is generated through some other process which has some linear functionals of the population state as its input. The boundedness of the world ensures that this dynamics converges to some attractor, which generally can be assumed to be ergodic.

Now assume that individuals may be distinguished by some inherited trait influencing their population dynamical behaviour. Since inheritance is never totally faithful, sometimes a mutant will occur. Usually such a mutant differs but little from its resident parent (large mutational steps tend to be lethal and can thus be neglected). This mutant finds itself in an environment set by the resident population, but initially does not impinge on that environment. Therefore the dynamics of the mutant is initially linear. Let $N(t)$ denote the expected mutant population size at time $t$. The linearity of the mutant dynamics combined with its positivity, primitivity and ergodicity guarantees that $\lim_{t \to \infty} \ln(N(t))/t =: \rho$ exists. $\rho$ is what we call “fitness”, defined in words as the asymptotic average relative rate of increase of a population in an ergodic environment. Fitness is always a function of two variables: the type of the individuals under consideration, and the environment in which they live. In the case of a novel mutant type the latter is set by the current resident types, so that fitness can be written as a (possibly multivalued) function of the invader trait value and the trait values of the resident types, to be written as $s(y \mid x_1, \dots, x_k)$, with $y$ the mutant type and $(x_1, \dots, x_k)$ the resident types, or also as $s_{x_1, \dots, x_k}(y)$ to bring out the idea of a resident-dependent fitness landscape.
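
The definition of fitness as the long-run average of ln N(t)/t can be illustrated with a deliberately simple sketch. It assumes an unstructured mutant population in discrete time whose size is multiplied at each step by a growth factor set by the environment, and a hypothetical environment that cycles through three states. None of these numbers come from the lecture. Under these assumptions the limit equals the average of the logarithms of the growth factors over one cycle.

```python
import math
from itertools import cycle, islice

def long_run_growth_rate(growth_factors, steps):
    """Approximate rho = lim ln(N(t)) / t for N(t+1) = factor(t) * N(t).

    The environment is assumed to cycle through `growth_factors`.
    Sizes are tracked on the log scale to avoid overflow or underflow.
    """
    log_n = 0.0  # ln N(0), with N(0) = 1
    for factor in islice(cycle(growth_factors), steps):
        log_n += math.log(factor)
    return log_n / steps

# Hypothetical cyclic environment: a good, a bad and an intermediate season.
factors = [2.0, 0.5, 1.5]
rho = long_run_growth_rate(factors, steps=3000)

# For a cyclic environment the limit is the mean log growth factor per cycle.
rho_exact = math.log(2.0 * 0.5 * 1.5) / 3
```

Because the product of the factors over one cycle exceeds one, the fitness is positive and this hypothetical mutant would grow on average, even though it declines in the bad season.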

The next step is to assume time scale separation: mutants appear so rarely that the population dynamics relaxes in between the appearance of new mutants. In that case we can base our analysis of the evolutionary process on no more than the properties of the function $s$. Under the indicated assumptions evolution can be represented as a random walk in $\mathcal{X} \cup \mathcal{X}^2 \cup \mathcal{X}^3 \cup \dots$, with $\mathcal{X}$ the trait space, directed by $s$: only mutants with a positive $s$ can invade, and they do so with a probability that for small mutational steps is proportional to $s$.

Landmarks in the adaptive state space are formed by the evolutionarily singular points, defined (for monomorphic residents) as those points where $\left.\frac{\mathrm{d}s_x(y)}{\mathrm{d}y}\right|_{y=x} = 0$. These singular points include rest points of the evolutionary trajectory as well as points at which this trajectory branches, i.e., enters $\mathcal{X}^2$. Away from these singular points, sufficient smallness of the mutational steps guarantees that invading mutants oust their progenitors.
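
To make the notion of a singular point concrete, the sketch below locates one numerically for a hypothetical invasion fitness function. The model is an assumption chosen for illustration, not the lecture's own: Lotka-Volterra competition with a Gaussian carrying capacity and a Gaussian competition kernel, so that the fitness of a mutant y in a resident population x is 1 - a(y, x) K(x) / K(y). The widths `SIGMA_K` and `SIGMA_A` and all function names are made up for the example.

```python
import math

SIGMA_K = 1.0  # width of the carrying-capacity curve (assumed value)
SIGMA_A = 0.5  # width of the competition kernel (assumed value)

def carrying_capacity(x):
    return math.exp(-x * x / (2 * SIGMA_K ** 2))

def competition(y, x):
    return math.exp(-((y - x) ** 2) / (2 * SIGMA_A ** 2))

def invasion_fitness(y, x):
    """Fitness of a rare mutant y in the environment set by resident x."""
    return 1.0 - competition(y, x) * carrying_capacity(x) / carrying_capacity(y)

def selection_gradient(x, h=1e-5):
    """Central-difference estimate of d s_x(y) / dy evaluated at y = x."""
    return (invasion_fitness(x + h, x) - invasion_fitness(x - h, x)) / (2 * h)

def find_singular_point(lo, hi, tol=1e-10):
    """Bisection on the selection gradient; assumes a sign change on [lo, hi]."""
    g_lo = selection_gradient(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = selection_gradient(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_star = find_singular_point(-1.0, 2.0)

# Curvature of the fitness landscape in the mutant direction at x_star.
# The resident itself has fitness zero, so s(x_star | x_star) drops out.
h = 1e-3
curvature = (invasion_fitness(x_star + h, x_star)
             + invasion_fitness(x_star - h, x_star)) / h ** 2
```

With these assumed widths the singular point lies at the maximum of the carrying capacity, and the positive curvature there means the resident sits at a fitness minimum. In this kind of model that is the situation usually associated with evolutionary branching rather than with a rest point.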

The evolutionary tree gets pruned when the evolutionary trajectory passes the border of the coexistence region in $\mathcal{X}^k$.

ORTEC is one of the world’s leaders in optimization and analytics solutions. We provide companies in a wide range of industries with the insights to become more efficient, more predictable and more effective. During this lecture you will not only learn more about ORTEC as a company, but also about some of our projects: text mining at the European Commission in Brussels, and dashboarding for the SWKGroep (a company for children’s daycare).