A full understanding of how life developed remains a distant
and elusive prospect, as cellular biology has proved more complex than was ever conceived.
Assembly theory, as presented by Lee Cronin (1), aims to describe a process that could accelerate evolution, by treating biological complexity as a function of how it was derived through algorithmic processes rather than of its current state, which is unfathomable in size and seems impossible to have arisen through random processes. The aim is to demonstrate that evolutionary steps are smaller and more probable than previously conceived.
An analogy given in a paper by Hector Zenil (2) is writing the
number pi to many decimal places. It would be impossible to arrive at this
number by randomly typing digits, even with a seemingly infinite number of attempts, as
the number of incorrect combinations is astronomical. However, a computer
programme which produces pi to millions of digits could be only
several hundred digits long once the programme itself is
represented as a string of digits. So, if you are still in the business of
typing out random numbers, you would be more likely to arrive at the programme
that can produce pi to millions of digits than at pi to millions of digits itself,
improbable though it still is. The idea is that this is still a random process, with
bits having the tendency to stick to one another through chemical bonds which
can be formed and broken, with environmental selection being the only guiding
hand. Of course, in the prior analogy a computer programme that produces pi
requires a computer, a language, and a memory space.
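To make the analogy concrete, here is a minimal sketch of a short programme that emits pi to as many digits as requested. It uses Machin's formula with integer arithmetic, which is an illustrative choice and not the method discussed in either paper; the function names `arctan_inverse` and `pi_digits` and the ten guard digits are assumptions made for this example.

```python
def arctan_inverse(x, unity):
    """Return arctan(1/x) scaled by `unity`, using integer arithmetic only."""
    term = unity // x              # first Taylor term: 1/x
    total = term
    x_squared = x * x
    divisor = 3
    sign = -1
    while term:
        term //= x_squared         # next odd power of 1/x
        total += sign * (term // divisor)
        sign = -sign
        divisor += 2
    return total


def pi_digits(n_decimals):
    """Return pi as a digit string: '3' followed by n_decimals decimals."""
    guard = 10                     # extra digits absorb truncation error (assumed sufficient)
    unity = 10 ** (n_decimals + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    scaled_pi = 16 * arctan_inverse(5, unity) - 4 * arctan_inverse(239, unity)
    return str(scaled_pi // 10 ** guard)


digits = pi_digits(1000)
print(len(digits))                 # number of digits produced
print(digits[:11])                 # leading digits of pi
```

The source above is a few hundred characters long, yet raising the argument to `pi_digits` yields ever more digits without the programme itself growing. Typing N specific digits at random succeeds with probability 10^-N, so the shorter description is the less improbable target.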
In essence, this is an example of data compression,
something which is widely utilised in data storage to preserve precious memory
space.
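As a rough illustration of the compression idea, the sketch below compares how well a general-purpose compressor shrinks highly regular data versus patternless data. Python's standard `zlib` module stands in for "data compression" here; the variable names, the 100,000-byte size and the random seed are arbitrary assumptions for the example.

```python
import random
import zlib

n_bytes = 100_000

# Highly regular data: a short rule ("repeat AB") describes all of it.
regular = b"AB" * (n_bytes // 2)

# Patternless-looking data: seeded pseudo-random bytes (seed chosen arbitrarily).
rng = random.Random(0)
patternless = bytes(rng.getrandbits(8) for _ in range(n_bytes))

regular_size = len(zlib.compress(regular))
patternless_size = len(zlib.compress(patternless))

print("regular:    ", n_bytes, "->", regular_size, "bytes")
print("patternless:", n_bytes, "->", patternless_size, "bytes")
```

The regular input collapses to a tiny fraction of its size, while the patternless input barely shrinks at all. Data that can be generated by a short rule needs far less storage than its raw length suggests.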
Human DNA contains a huge amount of data: about 3.2 billion
base pairs, or 6.4 billion bits of binary code, as DNA is constructed from 4
different bases and each base therefore encodes 2 bits. That is less than one gigabyte of data. By comparison, Microsoft
Windows might take about 40 GB of data on your hard drive. It is remarkable
that the most complicated thing that we know, ourselves, can be coded for by a
fraction of the data that is used to code Microsoft Windows. Either we are not
that complicated, which seems unlikely; there is some data compression at play
that is beyond anything we are capable of engineering; or there is
data storage outside of the genome. The latter is possible, as DNA has no potential
to propagate outside of a cell, though it is quite environmentally resilient,
hence the role that DNA evidence has come to play in resolving some crimes.
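The genome-size arithmetic can be checked in a few lines. At two bits per base, 3.2 billion base pairs work out to 6.4 billion bits, or 0.8 GB. This is a back-of-the-envelope sketch: the 3.2 billion and 40 GB figures are the approximate ones quoted in the text, the variable names are illustrative, and decimal gigabytes (10^9 bytes) are assumed.

```python
import math

base_pairs = 3_200_000_000             # approximate figure quoted in the text
bits_per_base = int(math.log2(4))      # 4 possible bases -> 2 bits each

genome_bits = base_pairs * bits_per_base
genome_gigabytes = genome_bits / 8 / 1e9   # 8 bits per byte, 10^9 bytes per GB

windows_gigabytes = 40                 # rough figure quoted in the text
ratio = windows_gigabytes / genome_gigabytes

print(f"{genome_bits:,} bits")
print(f"{genome_gigabytes:.1f} GB")
print(f"Windows install is about {ratio:.0f}x larger")
```

On these assumptions the operating system takes roughly fifty times the storage of the raw genome sequence.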
The question remains what information is stored outside of
the DNA within a cell. A single cell is at once the most advanced factory, the
most advanced biochemistry lab and the most sophisticated computational machine
that we have ever seen. It is possible that the DNA is not the sum total of all
the instructions to create a living organism but only the instructions to create
a unique organism, i.e., it contains the information for our variability but
not all the instructions to create a living organism. As the DNA is
functionless without a cell, some of the information to create the cell
could be taken for granted, as that information is inherent by virtue of the
cell being alive, even if it is not written down in the DNA.
The Cronin paper is certainly not without its controversies.
The idea of its application as a general theory of evolution, from single
molecules all the way to fully fledged organisms, seems overblown, and several scientists
have asserted that the paper fails to give due credit to prior studies which invoked very
similar or identical concepts.
In any case, increasing our understanding of cellular processes remains one of the greatest scientific goals, and one that is likely to yield enormous practical applications. Given the complexity and challenges that lie ahead, it is likely that multidisciplinary teams from many fields will need to collaborate on future projects.
References