Nick Goldman, Ewan Birney, and their colleagues at the EMBL-European Bioinformatics Institute have developed a way to store digital data in synthetic DNA, the molecule that carries genetic information in living cells.
Most data storage media require a consistent electricity supply, and those that don’t often degrade rapidly. Goldman and his associates realized that DNA might offer a solution to this problem. DNA, which carries the genetic information of all of Earth’s life forms, can store enormous amounts of data. It is also extremely long-lasting: samples drawn from 10,000-year-old biological matter are still “readable” using modern DNA sequencing techniques. As Goldman explains, DNA has other desirable characteristics: “It’s also incredibly small, dense, and does not need any power for storage, so shipping and keeping it is easy.”
There were still major problems to overcome. Modern DNA synthesis technology can produce only short strands of DNA. In addition, DNA, which is written in a four-symbol code (A, G, C, T), is difficult to sequence with high fidelity when it contains consecutive repeats of the same symbol. The team at EBI addressed both problems by writing the data in a code that never repeats a symbol twice in a row and by spreading it across short, partially overlapping fragments of DNA. Because neighboring fragments overlap, the computer that reads the sequenced DNA can reassemble the fragments in order and recover the full message.
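The idea of a non-repeating code on overlapping fragments can be illustrated with a short Python sketch. This is not the EBI team's actual implementation, which the text above does not specify. It assumes a simple scheme in which each byte is written as six base-3 digits and each digit selects one of the three bases that differ from the previous base, so no symbol ever appears twice in a row. The function names, the starting base "A", the fragment length of 100, and the step of 25 are all illustrative assumptions.

```python
from collections import Counter

BASES = "ACGT"

def bytes_to_trits(data: bytes) -> list[int]:
    """Write each byte as six base-3 digits (3**6 = 729 covers 0..255)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))  # most significant digit first
    return trits

def trits_to_dna(trits: list[int], prev: str = "A") -> str:
    """Each digit picks one of the three bases that differ from the previous one."""
    out = []
    for t in trits:
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        out.append(prev)
    return "".join(out)

def dna_to_trits(dna: str, prev: str = "A") -> list[int]:
    """Invert trits_to_dna; raises ValueError if a base repeats."""
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits

def trits_to_bytes(trits: list[int]) -> bytes:
    """Invert bytes_to_trits by reading six digits at a time."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)

def fragments(dna: str, length: int = 100, step: int = 25) -> list[tuple[int, str]]:
    """Cut the sequence into overlapping windows, each tagged with its start offset."""
    last_start = max(len(dna) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail is covered
    return [(s, dna[s:s + length]) for s in starts]

def reassemble(frags: list[tuple[int, str]]) -> str:
    """Rebuild the sequence by majority vote at every position."""
    total = max(s + len(f) for s, f in frags)
    votes = [Counter() for _ in range(total)]
    for start, frag in frags:
        for i, base in enumerate(frag):
            votes[start + i][base] += 1
    return "".join(v.most_common(1)[0][0] for v in votes)

message = b"Watson and Crick 1953"
dna = trits_to_dna(bytes_to_trits(message))
pieces = fragments(dna)
recovered = trits_to_bytes(dna_to_trits(reassemble(pieces)))
print(recovered == message)
```

In this sketch the start offset travels alongside each fragment as a Python tuple; a real scheme would have to encode such indexing information in the DNA itself. The overlap means most positions are covered by several fragments, which is what lets the majority vote tolerate an occasional read error.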
Appropriately, one of the first files Goldman stored in DNA form was James Watson and Francis Crick’s 1953 paper describing the “double-helix” structure of DNA.
Thanks to Goldman and his colleagues, we can now use DNA to store vast amounts of the data that make so many of our digital-era values possible. For this, they richly deserve our recognition and appreciation.