CS369: What is Computational Biology?
Dr Matthew Science University of Auckland
What is Biology?
Biology is the study of life. This is a broad target!
– individual organisms
– populations of organisms
– evolving systems (populations of changing
organisms over long time scales)
– ecological systems (interactions between
diverse populations)
– organism subsystems (e.g. organs)
– cell biology
– genetics
– metabolism
– ethology (behaviour)
– the origin of life (how is life different from
non-life and how did life emerge in a universe
without it?)
– philosophy of biology (e.g. what is life? what
should / should not be considered an
organism?) and ethics (e.g. predator free NZ)
How is biology different (e.g. from physics)?
Gas in a container: PV = nRT
Gravity: F = G m1 m2 / r^2
https://commons.wikimedia.org/wiki/File:Translational_motion.gif
How is biology different?
Biological systems are complex
– lots of interacting parts
– parts are different from each other
(heterogeneity)
– the parts interact non-linearly
As a quick example of non-linearity, think about our relationship with oxygen. None is very bad, a little is good, more is better, too much is deadly!
[Figure: three example response curves, one linear and two non-linear]
Complexity impedes simplification
High-dimensionality; Heterogeneity; Non-linearities;
…these things make systems difficult to simplify.
The effective simplification here assumes...
– each molecule is the same as every other
– direct interactions between the molecules are minimal
– the size of the molecules is insignificant (essentially zero) compared to the size of the container
Result: PV = nRT!
It is very hard to make such simplifications in a biological system.
And when simplifications can be made, they often cannot be generalized (transferred over to other situations).
This is not to say that simplification cannot be done in Biology!
As an example, biologists will often simplify the effects of a gene down to its “fitness” (i.e. how much it increases the likelihood of being able to reproduce).
From this simplification, many insightful equations have been derived to help us understand how populations evolve.
…but there are many biological situations that we would like to understand, where such simplification is not an option.
i.e. where simplifying would remove the elements that we want to understand!
i.e. where we are interested in the heterogeneous, high-dimensional and/or non-linear parts of the system.
Computational Methods
Computational methods are helpful in the investigation of complex (biological) systems.
It is worth taking a moment to remember that computers primarily enable us to do huge numbers of calculations in an automated and error-free way.
Let's think about how that can help.
https://wehackthemoon.com/people/human-computers-made-moon-missions-possible
Ways that computers can help…
– Simulation and Modelling
– Data Analysis and Visualization
– Experiment Automation
Simulation & Modelling
Mathematical biology has allowed us to create models of biological systems for centuries.
These models have had to be relatively simple if they are to be solvable using algebra:
– low-dimensional
– limited amount/types of non-linearity
Let's imagine an equation that describes how the number of sheep on an island grows over time…
dx/dt = αx − βxy
x: the number of sheep
y: the number of wolves
α: parameter describing how fast the sheep reproduce
β: parameter describing how effective the wolves are at hunting the sheep
Setting the left hand side (LHS) of the equation to zero, we can identify that in general there is a single equilibrium of the population (i.e. where the number of sheep does not change).
0 = αx − βxy
= x(α − βy), which is true when
x = 0 or (α − βy) = 0
With a fixed number of wolves, the second condition only holds in the special case y = α/β, so in general x = 0 is the only equilibrium.
We can also see that when the number of wolves is above some threshold (βy > α), any sheep present will be hunted to extinction, as dx/dt < 0.
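The threshold behaviour described above can be checked numerically. This is a minimal sketch, assuming a simple forward-Euler integration of the sheep equation with the number of wolves held fixed; the parameter values, the step size and the function name `simulate_sheep` are illustrative choices, not values from the lecture.

```python
def simulate_sheep(x0, y, alpha, beta, dt=0.01, steps=1000):
    """Integrate dx/dt = alpha*x - beta*x*y with forward Euler; wolves (y) are fixed."""
    x = x0
    for _ in range(steps):
        x += dt * (alpha * x - beta * x * y)
    return x

alpha, beta = 1.0, 0.1      # illustrative values, not from the lecture
threshold = alpha / beta    # number of wolves at which growth is exactly zero

few_wolves = simulate_sheep(x0=50.0, y=5.0, alpha=alpha, beta=beta)        # beta*y < alpha
many_wolves = simulate_sheep(x0=50.0, y=15.0, alpha=alpha, beta=beta)      # beta*y > alpha
balanced = simulate_sheep(x0=50.0, y=threshold, alpha=alpha, beta=beta)    # beta*y == alpha

print(few_wolves, many_wolves, balanced)
```

With few wolves the flock grows, with many it decays toward zero, and exactly at the threshold it stays where it started.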
But this model treats y, the number of wolves, as a fixed quantity. What happens when the number of wolves changes in a way that depends upon how many sheep there are available to eat (a fair assumption)?
If we consider the number of wolves to also be changing over time, we have two equations. One that describes the rate at which the number of sheep is changing... (this is the same as before)
And a new one that describes how the wolf population is changing.
Solving this system (e.g. to identify equilibria) is more complicated than the previous case. Here it is still possible to do so using algebra...
But I am sure you can imagine that as things become more complicated, they become more difficult or impossible to analyze using just algebra...
And as we shall see computational methods can help!
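As a hedged illustration of the two-equation case: the wolf equation is not reproduced in these notes, so the sketch below assumes the classic Lotka-Volterra form dy/dt = delta*x*y - gamma*y, where `delta` and `gamma` are hypothetical parameters introduced only for this example. It checks the coexistence equilibrium that algebra gives for that assumed form, then simulates the system with forward Euler.

```python
def derivatives(x, y, alpha, beta, delta, gamma):
    dx = alpha * x - beta * x * y     # sheep: same equation as before
    dy = delta * x * y - gamma * y    # wolves: ASSUMED Lotka-Volterra form
    return dx, dy

def simulate(x, y, params, dt=0.001, steps=20000):
    """Forward-Euler integration; returns the list of (sheep, wolves) states."""
    trajectory = [(x, y)]
    for _ in range(steps):
        dx, dy = derivatives(x, y, *params)
        x, y = x + dt * dx, y + dt * dy
        trajectory.append((x, y))
    return trajectory

params = (1.0, 0.1, 0.02, 0.5)    # alpha, beta, delta, gamma (illustrative)
alpha, beta, delta, gamma = params

# Coexistence equilibrium of the assumed model, found by setting both rates to zero.
x_eq, y_eq = gamma / delta, alpha / beta
rates_at_equilibrium = derivatives(x_eq, y_eq, *params)

trajectory = simulate(40.0, 9.0, params)
sheep = [state[0] for state in trajectory]
print(rates_at_equilibrium, min(sheep), max(sheep))
```

Away from the equilibrium the two populations cycle around it rather than settling, which is the kind of behaviour that is easier to explore by simulation than by algebra once more species or terms are added.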
High-dimensionality
Q: What is going on here? How do the birds do this? Which bird(s) are in charge? How is that decided? What is the underlying 'mechanism'?
Hypothesis (investigated by C. Reynolds): Each bird is following a few very simple rules, and the pattern simply emerges from these rules. No bird is 'in charge.'
Rule 1: Separation
Steer to avoid being too close to flock-mates -- collision avoidance.
Accelerate away from each flock-mate at a rate inversely proportional to distance, i.e. the closer you are to a flock-mate, the more you accelerate away from that flock-mate.
Images taken from C. Reynolds's website.
http://www.red3d.com/cwr/boids/
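A minimal sketch of the separation rule in two dimensions, assuming positions are (x, y) tuples; the function name `separation` and the `strength` parameter are hypothetical, and this is not Reynolds's own code.

```python
import math

def separation(position, neighbours, strength=1.0):
    """Acceleration away from each neighbour, with magnitude strength / distance."""
    ax = ay = 0.0
    for nx, ny in neighbours:
        dx, dy = position[0] - nx, position[1] - ny   # vector pointing away from the neighbour
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue                                  # coincident boids: direction undefined
        # unit vector (dx/dist, dy/dist) scaled by strength/dist
        ax += strength * dx / dist**2
        ay += strength * dy / dist**2
    return ax, ay

near = separation((0.0, 0.0), [(1.0, 0.0)])   # neighbour 1 unit away
far = separation((0.0, 0.0), [(4.0, 0.0)])    # neighbour 4 units away
print(near, far)
```

The nearer neighbour produces a push four times stronger than the one four times further away.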
Rule 2: Alignment
Steer toward the average heading of local flock-mates.
Calculate the mean velocity of all flock-mates. Update velocity to be a little bit more similar to that value.
Images taken from C. Reynolds's website.
http://www.red3d.com/cwr/boids/
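A minimal sketch of the alignment rule, assuming velocities are (vx, vy) tuples; `rate` (how far to move toward the mean heading each step) is a hypothetical parameter chosen for illustration.

```python
def alignment(velocity, neighbour_velocities, rate=0.1):
    """Nudge a boid's velocity a fraction `rate` of the way toward the neighbours' mean velocity."""
    if not neighbour_velocities:
        return velocity                      # no flock-mates in view: keep heading
    n = len(neighbour_velocities)
    mean_vx = sum(v[0] for v in neighbour_velocities) / n
    mean_vy = sum(v[1] for v in neighbour_velocities) / n
    return (velocity[0] + rate * (mean_vx - velocity[0]),
            velocity[1] + rate * (mean_vy - velocity[1]))

new_velocity = alignment((1.0, 0.0), [(0.0, 1.0), (0.0, 3.0)], rate=0.5)
print(new_velocity)
```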
Rule 3: Cohesion
Move toward the average position of local flock-mates.
Calculate the mean position of flock-mates and accelerate toward that position.
Images taken from C. Reynolds's website.
http://www.red3d.com/cwr/boids/
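A minimal sketch of the cohesion rule, again with (x, y) tuples; the `strength` parameter is an illustrative assumption.

```python
def cohesion(position, neighbour_positions, strength=0.01):
    """Acceleration toward the mean position of the neighbours."""
    if not neighbour_positions:
        return (0.0, 0.0)                    # nobody nearby: no pull
    n = len(neighbour_positions)
    centre_x = sum(p[0] for p in neighbour_positions) / n
    centre_y = sum(p[1] for p in neighbour_positions) / n
    return (strength * (centre_x - position[0]),
            strength * (centre_y - position[1]))

pull = cohesion((0.0, 0.0), [(2.0, 0.0), (4.0, 2.0)], strength=0.5)
print(pull)
```

In a full boids simulation each bird would add the separation, alignment and cohesion contributions at every time step; how the three are weighted is a modelling choice.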
Hypothesis (of C. Reynolds): Each bird is following a few very simple rules, and the pattern simply emerges from these rules.
Q: How could one test this hypothesis?
A. build a model!
p.10 of Strogatz, S. H. (2014). Nonlinear dynamics and chaos: With applications to physics, biology, chemistry, and engineering. Westview press.
Lots of modelling 'frameworks'
High-dimensionality, non-linearities and heterogeneity make it difficult or impossible to heavily simplify the systems we are studying. How do we come to understand these systems?
One answer is to model them.
There are lots of different modelling frameworks:
- Difference Equations (MATHS162)
- Differential Equations (MATHS260)
- Partial Differential Equations (MATHS361)
- Cellular Automata (BIOSCI702)
- Agent Based Modelling (BIOSCI702)
- Markov Models
- Molecular Dynamics Modelling (BIOSCI702)
- Process algebras
- Artificial chemistries
- many others...
Goals of models
The 'boids' example I just presented is just one example of how models are used in biology.
- test a hypothesis
- sufficiency proof (show that X
suffices to explain Y)
- predict the future of a system
- communicate an idea
- intuition pump
- creativity pump (e.g. develop a
hypothesis)
The scientific method... where do models fit in?
Data Analysis and Visualization
(Big) Data Analysis
We live in an era of unprecedented quantities of data.
"[Facebook] processes 2.5 billion pieces of content and 500+ terabytes of data each day. It's pulling in 2.7 billion Like actions and 300 million photos per day, and it scans roughly 105 terabytes of data each half hour"
We have so much data, we don't know what to do with it! We can't understand it directly.
This isn't the first time!
0.502351154604 0.868729618036 0.999966780444 0.860540338374 0.488189208866 -0.016301361194 -0.5163795995 -0.876688031036 -0.999701037239 -0.852122368368 -0.473897525855 0.032598390268 0.530270815468 0.884413462409 0.999169621452 0.843477945111 0.459479903612 -0.0488867562536 -0.544021110889
vs. [Figure: the same values drawn as a graph]
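To make the point concrete, here is a small sketch that draws the numbers above as a crude text graph. The observation that they match sin(10k/19) for k = 1..19 comes from inspecting the values and is not stated in the slides; `text_plot` is a hypothetical helper.

```python
import math

values = [
    0.502351154604, 0.868729618036, 0.999966780444, 0.860540338374,
    0.488189208866, -0.016301361194, -0.5163795995, -0.876688031036,
    -0.999701037239, -0.852122368368, -0.473897525855, 0.032598390268,
    0.530270815468, 0.884413462409, 0.999169621452, 0.843477945111,
    0.459479903612, -0.0488867562536, -0.544021110889,
]

# Observation (not from the slides): the values look like a sampled sine wave.
max_error = max(abs(v - math.sin(10 * k / 19)) for k, v in enumerate(values, start=1))

def text_plot(data, width=41):
    """One row per value; the column of the '*' encodes the value on [-1, 1]."""
    rows = []
    for v in data:
        col = round((v + 1) / 2 * (width - 1))
        rows.append(" " * col + "*")
    return "\n".join(rows)

print(text_plot(values))
print("largest deviation from sin(10k/19):", max_error)
```

The wave is hard to see in the raw numbers but obvious in the plot.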
(Big) Data Analysis
In a way that might be compared to the invention of graphs (17th c.?), computational methods can help us to understand data, by...
clustering, classifying, organizing, inferring relationships, visualizing and otherwise simplifying data
These can all help us to gain knowledge from information (data).
Perhaps the best known case of computers helping biology is the diverse study of DNA.
For instance, computers have made it possible to...
- assemble full genomes from observed short overlapping sequences (genome assembly)
- find repeated motifs within a sequence
- infer relationships between different species; how
long ago they separated on the tree of life, etc;
(phylogenetics)
- given a newly discovered gene, be able to rapidly
search for similar (but not identical) genes in a massive database of all known genes
We have substantial expertise here at UoA in this area, and we devote a fair amount of time to this area of Computational Biology.
ACTACTGGTCTACACACCCCGCGATCGGATTAT ACACTACTATTCACACACACACACGTAGGGGGG GCGATTATTATTATATTTATTCGGCTCTCTCTCTG CGCGCGGCGCGCGCGTAGTGATCGGTATGCTA CGTACGTAGCTAGCTGATCTGCATGTCGATCGG CGCGCGGCGGGGGGGAGAGAAATATGCGTTAT CCTCTCTACTACTATATCAATATTCATGGTAGGG TTGGGGGTTGGGCGCGGCGCGCGCCACGCGG GGAGATATATTACAACGTACGTACGATCGTACG ATCGTACGCATGCGTATGCATGCTGATGCATGC GTACGCAGGTCATGACTACTTCTGAGG
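A toy sketch of the idea behind genome assembly: stitching short overlapping reads back together. The three reads below were cut by hand from the first 26 bases of the sequence above for illustration. The greedy left-to-right merge and the names `overlap` and `merge` are simplifying assumptions; real assemblers must cope with sequencing errors, repeats and unknown read order.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of `a` that equals a prefix of `b` (0 if shorter than min_length)."""
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0

def merge(a, b, min_length=3):
    """Join two reads on their overlap; raise if they do not overlap enough."""
    n = overlap(a, b, min_length)
    if n == 0:
        raise ValueError("reads do not overlap")
    return a + b[n:]

# Hand-made overlapping reads, already in the right order (a big simplification).
reads = ["ACTACTGGTCTA", "GTCTACACACCC", "CACCCCGCGATC"]

contig = reads[0]
for read in reads[1:]:
    contig = merge(contig, read)

print(contig)
```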
...ABCDEFGHIJKLMNOPQRSTUVW...
...ABCXEFGXIJKXMNOXQRSXUVW...
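Comparing the two strings above by eye is possible; for sequences millions of characters long it is not. A minimal sketch, assuming the two sequences are already aligned and of equal length (the function name `mismatches` is hypothetical):

```python
def mismatches(a, b):
    """Positions (0-based) at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (p, q) in enumerate(zip(a, b)) if p != q]

original = "ABCDEFGHIJKLMNOPQRSTUVW"
variant = "ABCXEFGXIJKXMNOXQRSXUVW"

positions = mismatches(original, variant)
gaps = [later - earlier for earlier, later in zip(positions, positions[1:])]
print(positions, gaps)
```

The differences fall at regular intervals, a pattern that is easy to miss when reading the raw strings.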
Data Science
Given a (large) collection of entities, and the relationships between them, how can we come to understand the system?
Are there any 'natural' groups among the entities? (clusters)
Are there 'more important' entities within the group? (e.g. measures of 'centrality')
Are there repeated motifs of relationships within the system?
Computers and efficient algorithms can help with all of these questions.
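A small sketch of two of these questions on a made-up network. Degree centrality is used as one simple measure of 'importance', and connected components as a crude stand-in for 'natural groups'; both are illustrative choices, and real analyses typically use richer measures and clustering algorithms.

```python
from collections import defaultdict

# Made-up entities and relationships (undirected edges).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("e", "f")]

graph = defaultdict(set)
for u, v in edges:
    graph[u].add(v)
    graph[v].add(u)

def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}

def connected_components(graph):
    """Groups of nodes that can reach one another (depth-first search)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components

centrality = degree_centrality(graph)
most_central = max(centrality, key=centrality.get)
groups = sorted(sorted(component) for component in connected_components(graph))
print(most_central, groups)
```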
Transferable Skillz
Data analysis methods are of course not only applicable to genome data.
e.g. analyzing social media data (twitter, facebook)
e.g. analyzing urban systems (traffic dynamics)
e.g. other areas of science (chemistry, physics, psychology)
'Data Scientist' remains a top paying career path.
https://www.careers.govt.nz/jobs-database/it-and-telecommunications/information-technology/data-analyst/ (Last updated 10 December 2019)
Automated Experimentation
Robots are invading (the lab)!
The ability to automatically repeat the same experiment (or variations of that experiment) is hugely valuable to science.
Not just making things easier, but enabling new kinds of science to be done...
Artificial Evolution
"Herein we present a liquid-handling robot built with the aim of investigating the properties of oil droplets as a function of composition via an automated evolutionary process. The robot makes the droplets by mixing four different compounds in different ratios and placing them in a Petri dish after which they are recorded using a camera and the behaviour of the droplets analysed using image recognition software to give a fitness value. In separate experiments, the fitness function discriminates based on movement, division and vibration over 21 cycles, giving successive fitness increases."
Gutierrez, J., Hinkley, T., Taylor, J. et al. Evolution of oil droplets in a chemorobotic platform. Nat Commun 5, 5571 (2014). https://doi.org/10.1038/ncomms6571
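The loop described in the abstract (make droplets from a recipe, score them, keep the better recipes, vary them, repeat) can be sketched in software. This is only an analogy under stated assumptions: the `fitness` function below is a hypothetical stand-in for the camera-based measurement, the `target` recipe is invented, and the selection scheme is a generic elitist one, not the algorithm used in the paper.

```python
import random

def normalise(recipe):
    """Scale the four component amounts so that they sum to 1 (mixing ratios)."""
    total = sum(recipe)
    return [r / total for r in recipe]

def fitness(recipe):
    # HYPOTHETICAL stand-in for the measured droplet behaviour:
    # recipes closer to an invented target ratio score higher.
    target = [0.4, 0.3, 0.2, 0.1]
    return -sum((r - t) ** 2 for r, t in zip(recipe, target))

def mutate(recipe, rng, scale=0.05):
    """Perturb each ratio slightly, keep it positive, and renormalise."""
    return normalise([max(1e-6, r + rng.gauss(0, scale)) for r in recipe])

def evolve(generations=21, population_size=20, seed=0):
    rng = random.Random(seed)
    population = [normalise([rng.random() + 1e-6 for _ in range(4)])
                  for _ in range(population_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        parents = population[: population_size // 2]     # keep the better half
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return max(population, key=fitness), best_per_generation

best, history = evolve()
print(best, history[0], history[-1])
```

Because the best recipes are carried over unchanged, the best score in this sketch can never go down from one generation to the next.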
Philosophy of Biology & Philosophy of Modelling
Philosophy of Biology
Computational biology should be seen as complementary to other forms of biology, including observation, experimentation, mathematical modelling, and philosophy of biology.
Philosophy of biology has a longer history than the scientific investigation of biology (Aristotle, 4th c. BCE). Philosophy of biology considers questions like...
- What is life?
- What is an organism? (Interesting work going on in
this area at the moment w.r.t. the 'microbiome')
- Is biology 'reducible' to chemistry? (consider the
'emergence' in the boids model.)
- Are there laws in biology similar to the laws of physics?
- How is it possible for an agent to act to satisfy its own needs?
Philosophy of Modelling
Computational biology is still a young area. It comes with its own sets of strengths and weaknesses.
- What is a good model?
- What can models tell us?
- When can we be sure that a model is predictive?
- Does the future of science involve giving up on
knowledge and just using (e.g.) machine learning to predict things?
These are philosophy of science questions, and we won't get into them in detail, but they are important things to consider, not just as a potential scientist, but as a citizen.
Increasingly decisions are being informed by models. In what situations is this (not) okay?
- Models to predict the weather
- Models to predict climate change
- Models to predict whether a convict should be
released from prison
My research...
What is life? What distinguishes living systems from non-living systems?
Living systems are perpetually falling apart, but managing to persist thanks to mechanisms of self-(re)construction.
- Does this make them capable of adapting in ways that non-living systems cannot?
- Does it provide a way for us to understand the idea of a system with its own needs and goals? and perhaps eventually some day, desires and (conscious) experience?
Closing Comments
Goals of Computational Biology
Understand biological systems, so as to be able to
- predict their future;
- effectively intervene (e.g. medical
intervention, ecological intervention);
- engineer 'biologically inspired' things (e.g.
AI, aircraft wings, insect-like robots (xenobots), algorithms, strategies for managing organizations, self-healing materials, etc.);
- create life or living technology;
- effectively define life, and distinguish it
from non-life (this is still an open problem,
with relevance to SETI)
Computational biology is interdisciplinary, and perhaps for that reason there are many closely related fields...
- Bioinformatics (data focused)
- Mathematical Biology
- Systems Biology
- Applied Mathematics
- Data Science
- ...
The differences between these areas are sometimes subtle or non-existent.
How is biology different?
Biological systems are exceptionally complex
- lots of interacting parts
- parts are different from each other
(heterogeneity)
- they interact non-linearly
Example of Non-linearity
Imagine a small population of cows eating grass. For a small population, the cows reproduce at a rate that is linearly proportional to their population size until e.g. there is insufficient food to support that growth... at which point the response becomes non-linear.
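One common way to capture this is a logistic growth rule, sketched here as a difference equation. The logistic form, the `capacity` parameter (how many cows the grass can support) and all numbers are assumptions made for illustration; the slide does not specify a particular equation.

```python
def grow(population, rate, capacity, steps):
    """Discrete logistic growth: near-proportional growth when small, levelling off near capacity."""
    history = [population]
    for _ in range(steps):
        population += rate * population * (1 - population / capacity)
        history.append(population)
    return history

history = grow(population=10.0, rate=0.2, capacity=1000.0, steps=100)

early_ratio = history[1] / history[0]    # close to 1 + rate while food is plentiful
late_ratio = history[-1] / history[-2]   # close to 1 once grass limits growth
print(early_ratio, late_ratio, history[-1])
```

While the herd is small each step multiplies it by almost the same factor; as it approaches the capacity, the same rule produces almost no growth at all.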