In September we featured a guest post from Bastian Greshake on how Open Science helped him advance his career. The post was published in conjunction with the Winnower Writing Competition, an initiative designed to prove the benefits of data sharing and collaboration and how they can help researchers advance their careers. We caught up with Bastian and interviewed him about openSNP, a public platform that gives people a chance to donate their personal genetic data, as well as his current challenges, explaining privacy concerns to the general public, and the future of openSNP.
1. What is OpenSNP? (i.e. What does it do? Who is the target audience? Who can use it and how? What is it good for?)
The elevator pitch is: openSNP is a platform that gives people a chance to donate their personal genetic data into the public domain, alongside phenotypic annotations. Citizen scientists, educators and everyone else can then access the data and use it for their own ends. So we are targeting two audiences at the same time. On the one hand we want people to share their data and hopefully learn something useful in the process; on the other hand we want people to re-use the data for interesting projects.
For the first group we are offering annotations to the genetic variants, i.e. how does having this genetic variant affect me? To this end we mine external open data sources, like the Public Library of Science, SNPedia or the GWAS catalog. For the second group we offer different APIs in order to facilitate easy use of the data.
2. What is your background and who are the co-founders of OpenSNP?
I’m an evolutionary biologist, currently doing a PhD in bioinformatics. But that’s in a field completely unrelated to openSNP. I started openSNP together with Philipp Bayer (@PhilippBayer), who’s also a biologist-turned-bioinformatician, back in 2011. Since then we have been joined by Helge Rausch (@helgerausch), who’s a very capable web developer, and Julia Reda (@senficon), who helped us with the social sciences side of the project before becoming a member of the European Parliament.
3. Why did you start OpenSNP?
Largely out of frustration over the data sharing situation back then. When I got access to my own genotyping data, I wanted to share it with the world, so that others could use it for their research. Unfortunately that wasn’t easily done: while the Personal Genome Project existed, it only accepted data from the US, and data published on personal websites is not easily re-usable. So I approached Philipp and asked whether he’d be interested in running a small repository for personal genomics data.
4. How and by whom are you supported?
So far we are by and large self-supported. We got some prize money from the Public Library of Science and Mendeley, as well as a small grant from Bayer a couple of years ago, but our regular expenses for running openSNP are paid for with money earned from our day jobs, which is increasingly becoming a problem. With over 2000 data sets we are hitting the limits of what our current IT infrastructure can handle. This is why we started a small Patreon campaign.
5. How does OpenSNP fit into the discussion of privacy and consent for data sharing?
In a way openSNP is an experiment on genomic privacy and data sharing. By design, data uploaded to openSNP is publicly available without any restrictions: it is released into the public domain. This means there are no restrictions on re-identification either. On the open-closed dimension, openSNP sets an example of extreme openness.
6. Have you received any negative response?
Initially many people are taken aback when they hear about openSNP and how the data is made available for everyone. We totally understand that; it’s certainly not for everyone. But so far we haven’t gotten any negative responses from actual users who have published their data.
7. How do you explain privacy issues to the general public who might use your platform?
I like to say that openSNP’s Terms of Service are human-readable, not only lawyer-readable, in keeping with the Creative Commons spirit. When people sign up we present them with the possible consequences of sharing data with the whole world, for example losing health insurance, employment and so on. We also try to make clear that genetic knowledge isn’t static and that new implications for someone’s data set might emerge after they have uploaded it. After all, one reason for sharing data is to generate new knowledge. So basically, our strategy is to scare away everyone who’s not 100% convinced that sharing will work for them.
8. What are your current challenges in running OpenSNP?
Our largest challenge right now is to keep the infrastructure up and running. With the amount of data hosted on openSNP growing virtually every day, we will have to upgrade our servers sooner rather than later. This is why we hope that some people would like to chip in for those growing hosting costs.
9. What will be the future of OpenSNP?
Hopefully openSNP will continue to grow at the current pace, even if it gives us some headaches from time to time. We also hope to add support for more data types – genetic and phenotypic – to make the whole platform even more useful.
If you want to read more about Bastian Greshake and openSNP, you can read his research article, “openSNP-A Crowdsourced Web Resource for Personal Genomics”.
Are you part of a project that facilitates data sharing for genomics research?
Would you like to be featured on our blog?