The work of DNAdigest has recently been featured in PLoS Biology

Our team was invited to contribute to the Community Pages of PLoS Biology.

We drew attention to the fact that only a small fraction of sequencing data is deposited in repositories, where it can be accessed and re-used in research (Fig. 1).


Fig. 1. Whereas ~80 petabytes of sequencing data is generated every year, only ~0.5 petabytes is accessible via repositories.

This gap between the production of genomic information and its availability can be at least partially attributed to the absence of tangible benefits for the individuals who make data available and, at the same time, to the existence of sanctions for improper handling of personal information. However, when data donors give consent for their data to be used for research, they expect that the data will actually be used for this purpose. Failing to make the best possible use of their data within the consent given goes against the donors' interests and expectations.

Ironically, human genomic data is probably the most important data to share, since it lies at the heart of efforts to combat major health issues such as cancer, genetic diseases, and genetic predispositions for complex diseases like heart disease and diabetes. In particular, the promise of personalised medicine (in which treatment is tailored to the individual) is unlikely to be realised without widespread access to large amounts of genomic data.

In the paper, we included a list of ~30 genomic data sources that we believe could be useful for researchers.

Read our Open Access paper here and please share it with people who need genomic data for their research!

Are you part of a project that facilitates data sharing for genomics or other related research?

Are you directly or indirectly involved in the Open Science movement?

Would you like to be featured on our blog?

We would love to hear from you.

Write to us or use our contact page to get in touch.