Oligoarchive - Intelligent DNA Storage for Archival

Start Date: 01-10-2019

End Date: 30-09-2022

Id: OLIGOARCHIVE

CORDIS identification number: 863320

The "digital universe" of all known data worldwide is expected to grow to 250 Zettabytes by 2025. Unfortunately, all current storage media face fundamental limitations that threaten our ability to store, much less process, all this data. Hard disk drives (HDDs) suffer from well-known scaling issues that have resulted in a meager 16% annual density improvement over the past decade, compared to the 60% annual rate of data growth. Tape drives suffer from media obsolescence, as data stored on tape has to be continuously migrated to deal with technology upgrades. If we are to preserve even just a fraction of the world's data, we are in desperate need of a radically new storage medium with substantially better density and durability characteristics. In this proposal, we focus on one such medium that has received limited attention recently: synthetic deoxyribonucleic acid (DNA). Using DNA as a digital storage medium has multiple advantages. First, DNA is an extremely dense storage medium. Second, DNA can last several centuries; by contrast, HDD and tape have lifetimes of five and thirty years, respectively. Third, the technologies used for storing data on DNA (synthesis) and retrieving data back from DNA (sequencing) have eternal relevance; as long as there is life on earth, there will always be the need to synthesize and sequence DNA. Fourth, there is the potential to process the data stored in DNA using biomolecular mechanisms. Doing so is substantially faster and requires much less energy than traditional computing. Despite such benefits, DNA storage and DNA data processing are new areas of research. In this proposal, we outline a research agenda which will develop the fundamental technologies needed to build an intelligent DNA storage system. The resulting prototype system will support the full cycle of encoding data, synthesizing it as DNA, and reading it back through sequencing. It will optimally store a variety of different types of data and enable near-data processing in the storage.
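
To make the encode and read-back cycle described above more concrete, here is a minimal sketch in Python. It assumes a naive mapping of two bits per nucleotide (00 to A, 01 to C, 10 to G, 11 to T). That mapping and the function names `encode` and `decode` are illustrative assumptions, not the scheme used by the project; a practical system would also need error correction and constraints on the sequences it produces, and the synthesis and sequencing steps themselves are chemical processes that are not modeled here.

```python
# Hypothetical mapping for illustration only: each 2-bit value becomes one nucleotide.
BASES = "ACGT"  # index 0b00 -> A, 0b01 -> C, 0b10 -> G, 0b11 -> T


def encode(data: bytes) -> str:
    """Turn each byte into four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Invert encode(): rebuild one byte from every four nucleotides."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # str.index raises ValueError for any character outside ACGT.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # nucleotide string standing in for the synthesized DNA
    print(decode(strand))  # bytes recovered from the "sequenced" strand
```

Under this assumed mapping, every byte occupies exactly four nucleotides, which gives a rough sense of why DNA is considered a dense medium at the molecular scale.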