Assembly Theory: Evolution Compressed

Controversial theory unpacked

RNfinity | 11-05-2024

A full understanding of the development of life seems a distant and elusive prospect, as cellular biology is more complex than was ever conceived.

Assembly theory, as presented by Lee Cronin and colleagues (1), aims to describe a process that could accelerate evolution. It treats biological complexity as a function of how an object was derived through algorithmic processes, rather than of its current state, which is unfathomably large and seems impossible to have arisen through purely random processes. The aim is to demonstrate that evolutionary steps are smaller and more probable than previously conceived.

An analogy given in a paper by Hector Zenil and colleagues (2) is writing the number pi to many decimal places. It would be practically impossible to arrive at this number by randomly typing digits, however many attempts were made, as the number of incorrect combinations is astronomical. However, a computer programme which produces pi to millions of digits could itself be only a few hundred digits long, since the programme can be represented as a string of digits. So, if you are still in the business of typing out random numbers, you would be far more likely to arrive at the programme that produces pi to millions of digits than at the million digits themselves, improbable though it still is. The idea is that this is still a random process, with bits having the tendency to stick to one another through chemical bonds which can be formed and broken, and with environmental selection being the only guiding hand. Of course, in this analogy a computer programme that produces pi requires a computer, a language, and memory space.
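The pi analogy can be made concrete with a minimal Python sketch. It is an illustration only and is not taken from either cited paper: the function names, the use of Machin's formula, the 400-character programme length and the 95-symbol keyboard alphabet are all assumptions chosen for the example.

```python
import math

def arctan_inv(x, unity):
    """Return arctan(1/x) scaled by `unity`, using integer arithmetic only."""
    term = unity // x
    total = term
    x_squared = x * x
    n = 3
    sign = -1
    while term:
        term //= x_squared          # next odd power of 1/x
        total += sign * (term // n)
        sign = -sign
        n += 2
    return total

def pi_digits(decimals):
    """Return pi as a digit string: '3' followed by `decimals` decimal places."""
    guard = 10                      # extra digits to absorb truncation error
    unity = 10 ** (decimals + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 16 * arctan_inv(5, unity) - 4 * arctan_inv(239, unity)
    return str(pi_scaled // 10 ** guard)

def log10_chance(length, alphabet_size):
    """log10 of the chance of typing one specific string uniformly at random."""
    return -length * math.log10(alphabet_size)

print(pi_digits(30))
print(log10_chance(1_000_000, 10))   # a million decimal digits typed blindly
print(log10_chance(400, 95))         # hypothetical 400-character programme
```

Under these assumptions the short programme has a chance of roughly 1 in 10^791 of being typed at random, against 1 in 10^1,000,000 for the digits themselves. Both are hopeless in practice, but the gap between them is the point of the analogy.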

In essence this is an example of data compression, something which is widely used in data storage to preserve precious memory space.
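A rough way to see compression at work is to compare how far a general-purpose compressor can shrink patterned data versus patternless data. The sketch below uses Python's standard `zlib` module; the 100,000-byte size, the repeated "ACGT" pattern and the fixed random seed are arbitrary choices for illustration and say nothing about how real genomes compress.

```python
import random
import zlib

size = 100_000
rng = random.Random(0)              # fixed seed so the run is repeatable

patterned = b"ACGT" * (size // 4)   # highly regular: one short rule generates it
noisy = bytes(rng.getrandbits(8) for _ in range(size))  # no rule to exploit

patterned_packed = zlib.compress(patterned, 9)
noisy_packed = zlib.compress(noisy, 9)

print(len(patterned), "->", len(patterned_packed))
print(len(noisy), "->", len(noisy_packed))
```

The patterned input should shrink to a tiny fraction of its size, while the random input should barely shrink at all, because there is no shorter description of it for the compressor to find.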

Human DNA contains a huge amount of data: about 3.2 billion base pairs. As DNA is constructed from 4 different bases, each base carries 2 bits, giving about 6.4 billion bits of binary code, or less than one gigabyte of data. By comparison, Microsoft Windows might take about 40 GB on your hard drive. It is remarkable that the most complicated thing that we know, us, can be coded for by a fraction of the data that is used to code Microsoft Windows. Either we are not that complicated, which seems unlikely; or there is data compression at play beyond anything we are capable of engineering; or there is data storage outside of the genome. The latter is possible, as DNA has no potential to propagate outside of a cell, though it is quite environmentally resilient, hence the role that DNA evidence has come to play in resolving some crimes.
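The genome-size arithmetic can be checked in a few lines. This is a back-of-envelope sketch: the constant names are invented for the example, it counts a single copy of the genome at exactly 2 bits per base, and the 40 GB figure is the article's rough estimate taken as 40 × 10^9 bytes.

```python
import math

BASE_PAIRS = 3_200_000_000          # approximate human genome length (from the text)
BASES = 4                           # A, C, G, T
WINDOWS_BYTES = 40 * 10**9          # the article's rough 40 GB figure

bits_per_base = int(math.log2(BASES))        # 2 bits distinguish 4 symbols
genome_bits = BASE_PAIRS * bits_per_base
genome_bytes = genome_bits // 8
genome_gb = genome_bytes / 10**9

print(genome_bits, genome_bytes, genome_gb)
print(WINDOWS_BYTES // genome_bytes)         # how many genomes fit in 40 GB
```

On these figures the genome comes to about 0.8 GB, so the 40 GB installation is roughly 50 times larger.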

The question remains as to what information is stored outside of the DNA within a cell. A single cell is at once the most advanced factory, the most advanced biochemistry lab and the most sophisticated computational machine that we have ever seen. It is possible that the DNA is not the sum total of all the instructions to create a living organism, but only the instructions to create a unique organism, i.e., it contains the information for our variability but not everything needed to create a living organism. As DNA is functionless without a cell, some of the information needed to create the cell could be taken for granted: it is inherent in the cell being alive, even if it is not written down in the DNA.

The Cronin paper is certainly not without its controversies. The idea of applying it as a general theory of evolution, from single molecules all the way to fully fledged organisms, seems overblown, and several scientists have asserted that the paper fails to credit prior studies which invoked very similar or identical concepts.

In any case, increasing our understanding of cellular processes remains one of the greatest scientific goals, and one that is likely to yield enormous practical applications. Given the complexity and challenges that lie ahead, it is likely that multidisciplinary teams from many fields will need to collaborate on future projects.



1) Sharma, A., Czégel, D., Lachmann, M. et al. Assembly theory explains and quantifies selection and evolution. Nature 622, 321–328 (2023)

2) Hernandez-Orozco, S., Kiani, N.A., Zenil, H. Algorithmically probable mutations reproduce aspects of evolution, such as convergence rate, genetic memory and modularity. R. Soc. open sci. 5, 180399 (2018)