Guest post

Data Sharing 101 – a brief introduction for everyone

Written by Spencer Gibson, PhD, Research Associate at the University of Leicester. The DataSharing 101 site aims to be a launching pad for anyone interested in sharing data for biomedical purposes. The site started life as part of my work in Prof. Brookes’ research group at Leicester University, when working on the Genomic’s Clinic of the Future (GCoF) project. This was an EU Horizon 2020-funded initiative to bring together scientists from different disciplines to investigate the various issues around sharing genomic data. My contribution to this project focused mainly on investigating current systems for sharing this type of information, and their applicability for sharing data between the clinical and research environments. The website became an avenue to disseminate this information to anyone who also wanted to explore this area further. As such, the site aims to give some basic background with links out to other resources that either explore specific areas in greater detail or provide specific services or tools for data sharing. As the GCoF project focused on data sharing in a biomedical/clinical context, it allowed me to focus on a small number of key groups. These were the patient, the clinician/healthcare professional, the data user and the […]

Cancer Moonshot and the future of Precision Medicine

This guest post is by Emily Walsh, community outreach director at the Mesothelioma Cancer Alliance. The Mesothelioma Cancer Alliance, while focused on raising awareness for mesothelioma and bringing about the ban of asbestos, is working to shed some light on what this might mean for oncology patients in the near and far off future. In 2016, former US Vice President Joe Biden was given the task of heading the Cancer Moonshot Initiative by former President Barack Obama, an initiative meant to pave the way for finding a cure for cancer by the year 2020. By no means an easy feat, the goal of the Moonshot program is to make existing treatments available to more people, improve our ability to prevent cancer, and explore ways to more easily detect it at an earlier stage. Since then, the Blue Ribbon Panel was created, with industry leaders, doctors, and advocates convening to discuss and present recommendations to make a decade of progress in only five years. After their conference, the panel presented a linear plan outlining 10 recommendations that provide clear goals for researchers, advocates, doctors, and other industry professionals. Included in the panel’s recommendations is the goal to build a national cancer data ecosystem. […]

Is sharing always caring? On open genomic data sharing and why people do it.

Originally published on the Repositive blog and reproduced with permission. Special thanks to Tobias Haeusermann (postdoctoral researcher at University of Zürich) and Bastian Greshake (co-founder of OpenSNP.org) for collaborating to write this guest blog post. In times of political turmoil, we tend to see discussions about the responsibility of science in academic circles. But unfortunately, something is rotten in the state of academia too. While the academic pursuit should, first and foremost, entail the cultural accumulation of knowledge and its transmission across generations and borders, the structures and strictures of science often tend to hinder rather than foster the sharing of knowledge. In their recent book “A Passion for Society: How We Think about Human Suffering”, sociologist Iain Wilkinson and medical anthropologist Arthur Kleinman openly address academia’s centuries-old dirty little secret: the barriers the ‘ideal’ dispassionate researchers erect around themselves are frequently selfish and self-serving. Oftentimes, Wilkinson and Kleinman write, “what now passes as social science is in thrall to technocratic procedures and structures of career that leave it critically sterile, cynical and devoid of passion” (p. xi). They conclude that now might be the time to renegotiate the terms once again. As medical researcher John Tregoning lamented in his […]

Reflections from my time with the Genomic Data Commons

This post is by Piers Nash – biochemist, data evangelist and futurist. Originally published on his personal blog and reproduced with permission. I have spent the past 13 years at The University of Chicago – most of that time as a faculty of Cancer Research. From December 2013 through January 2017 I spent my days (and a fair number of nights) working with the University of Chicago’s Center for Data Intensive Science. During that time, I had the unique privilege to be integrally involved in what I view as a truly transformational national effort to develop a purpose-built private object storage cloud for the nation’s cancer genomic data. This is the National Cancer Institute’s Genomic Data Commons, or GDC. The team at the University of Chicago architected and built the GDC starting in 2013, and we launched on June 6, 2016 with none other than Vice President Joe Biden. With forward-looking technology, scale and use cases, we were in a position to make the project a centerpiece of the Cancer Moonshot Initiative. At launch, the GDC became the largest repository of harmonized cancer genomic data on Earth. As I depart the University of Chicago and leave this amazing project standing […]

From PhD to Product Management – Life in a startup

A post by Charlotte Whicher, Product Manager at Repositive Ltd. Originally published here and reposted with permission. In the last couple of weeks I have been invited to attend two events for academic researchers focusing on different non-academic career pathways and skills. I have been asked to talk about my career journey so far, and about combining science with business. This, alongside Nadia (our Scientific Liaison) asking “Do you think you follow a standard scientific career path or is your journey significantly different from others’?” in her recent interview of Aubrey de Grey for DNAdigest, has caused me to reflect a bit on my journey so far. What is this post about? Me. More explicitly, my career path to date, and the factors and decisions that led me here. Why should you read it? If you are a researcher who is thinking of leaving academia, or if you are generally interested in the transition from academia to business, this might be interesting for you. What will you get from this post? You will learn the true story of my transition from working at a leading research institute to a job in a biotech startup. Hopefully, this will give you a […]

10 Simple Rules for Sharing Human Genomic Data

The Repositive team, together with Springer Nature, has recently formulated “10 simple rules for sharing human genomic data”. “These 10 Simple Rules have been developed from our combined experiences of working with human genomic data, data repositories and data users. We do not claim that these rules will eliminate every possible risk of data misuse. Rather, we hope that these will help researchers to increase the reusability of their human genomic data, whilst also ensuring that the privacy of their subjects is maintained according to their consent frameworks. Many of the principles presented are also applicable to other types of clinical research data, where participant privacy is a concern.” The manuscript by Manuel Corpas, Charlotte Whicher, Nadezda V. Kovalevskaya, Tom Byers, Amanda A. McMurray, and Fiona G.G. Nielsen of Repositive Ltd, Future Business Centre, Cambridge, UK, and Varsha K. Khodiyar of Springer Nature, London, UK, was originally submitted to Biorxiv.org. Introduction: Delivery of the promise of precision medicine relies heavily on human genomic data sharing. Sharing genome data generated through publicly funded projects maximises return on investment from taxpayer funds and increases the likelihood of obtaining funding in future rounds [1]. More importantly, genome data sharing makes it possible for […]

Walking the talk – reflections on working ‘openly’

As part of Open Access Week 2016, the University of Cambridge Office of Scholarly Communication published a series of blog posts on open access and open research. In this post, Dr Lauren Cadwallader discusses her experience of researching openly. Earlier this year I was awarded the first Altmetric.com Annual Research grant to carry out a proof-of-concept study looking at using altmetrics as a way of identifying journal articles that eventually get included into a policy document. As part of the grant condition I am required to share this work openly. “No problem!” I thought, “My job is all about being open. I know exactly what to do.” However, it’s been several years since I last carried out an academic research project and my previous work was carried out with no idea of the concept of open research (although I’m now sharing lots of it here!). Throughout my project I kept a diary documenting my reflections on being open (and researching in general) – mainly the mistakes I made along the way and the lessons I learnt. This blog post summarises those lessons. To begin at the beginning I carried out a PhD at Cambridge not really aware of scholarly best practice. […]

A new genome editing review from Nuffield Council on Bioethics

In September 2016, the Nuffield Council on Bioethics presented Genome editing: an ethical review. The summary below is written by Jessica Cussins and was originally published here. Reposted with the author’s permission. 7 Highlights from Nuffield Council’s Review on the Ethics of Genome Editing. Posted by Jessica Cussins, Biopolitical Times guest contributor, on October 18th, 2016. The UK Nuffield Council on Bioethics’ recently released report, Genome Editing: an ethical review (full version available here), is the most substantial and thorough assessment of its kind. It delves deeply into the ethical, social, and political underpinnings and implications of genome editing, and touches on related, converging technologies including synthetic biology, gene drives, and de-extinction. A second report with ethical guidance regarding the use of genome editing for human reproduction is due in early 2017 from a Council working group chaired by Karen Yeung. This first report will be an important reference for people across disciplines for some time, and I will not do justice to its scope and breadth here. However, I want to draw attention to just seven concepts that are particularly helpful and illuminating, as much for their framing of the questions at stake as for their content. I briefly summarize […]

A beginner’s guide to data sharing

Originally published on the Cogtales blog and reposted with the author’s permission. Science is becoming more and more open and transparent, and I think that’s awesome. An important aspect is sharing whatever information is necessary to reproduce results; usually that includes data and scripts. While open science can be beneficial for a researcher, this practice is still being met with some (justified) skepticism, but has become more and more accepted and common in research; in fact, PLOS One, for example, made it a requirement for publication (how well that’s going is a different story). Funding agencies across the globe are quickly following suit, so chances are high you either already have to or will in the near future think about data sharing. But what does it entail? There are several issues that in my view do not receive enough attention, and that add unnecessary hurdles in the sharing and re-use of data. But first, let me get this out of the way: sharing data is great, but you should do it the right way. If you succeed, you will not only help the community, but also yourself by making your work more visible and even citable. This way you get credit for your […]

Get the most out of your impact data

It’s time to put our impact data to work to get a better understanding of the value, use and re-use of research. Published under CC BY 3.0 license. Originally published by Liz Allen, PhD on the London School of Economics and Political Science Blog. If published articles and research data are subject to open access and sharing mandates, why not also the data on impact-related activity of research outputs? Liz Allen argues that the curation of an open ‘impact genome project’ could go a long way in remedying our limited understanding of impact. Of course there would be lots of variants in the type of impact ‘sequenced’, but the analysis of ‘big data’ on impact could facilitate the development of meaningful indicators of the value, use and re-use of research. We know that research impact takes many forms, has many dimensions and is not static, as knowledge evolves and the opportunities to do something with that knowledge expand. Over the last decade, research institutions and funding agencies have got good at capturing, counting and describing the outputs emerging from research. A lot of time and money has been invested by funding agencies to implement grant reporting platforms to capture the myriad outputs and products of research (e.g. […]

Genealogy and genomics take their vows

Guest post by Brianne Kirkpatrick, MS, LGC, genetic counselor. Genomics research and genealogy have been dating for a few years now, and it seems that 2015 was the year they finally took their vows. With the growth of interest in tracing familial lineages — genealogy being the second-most favorite hobby reported by Americans — the technologies created for searching historical records of families are available instantly, with a mouse click or a screen swipe. Engagement in family history collection and availability of commercial DNA testing for ancestry are galvanizing the general public alongside the growth of genomics databases in research and industry. Growing interest in uncovering ethnic roots and genetic family has opened the doors for novel research projects, leading to a new cohort of willing and able participants. Some readers might already be familiar with the DNA.Land project, a non-profit partnership between New York Genome Center and Columbia University. DNA.Land is accepting raw genotype data files from participants who were able to obtain these files by purchasing commercial ancestry testing. Unaffiliated with the testing companies themselves, DNA.Land provides a re-analysis of the computerized genotype data and provides additional tools, such as a participant-matching database and ethnicity estimate. This research project provides an opportunity […]

Objections to data sharing don’t stand up to scrutiny

Who’s afraid of Open Data: Scientists’ objections to data sharing don’t stand up to scrutiny. Many scientists are still resisting data sharing calls. Whilst their concerns should be taken seriously, Dorothy Bishop doesn’t think the objections withstand scrutiny. Concerns about being scooped are frequently cited, but are seldom justified. If we move to a situation where a dataset is a publication, then the original researcher will get credit every time someone else uses the dataset. And in general, having more than one person doing an analysis is an important safeguard for science. I was at a small conference last year, catching up on gossip over drinks, and somehow the topic moved on to journals, and the pros and cons of publishing in different outlets. I was doing my best to advocate for open access, and to challenge the obsession with journal impact factors. I was getting the usual stuff about how early-career scientists couldn’t hope to have a career unless they had papers in Nature and Science, but then the conversation took an interesting turn. “Anyhow,” said eminent Professor X. “One of my postdocs had a really bad experience with a PLOS journal.” Everyone was agog. Nothing better at conference drinks than a new twist on […]

Genomic data sharing: How much oversight is necessary

This is a guest blog post by Mahsa Shabani, LL.B., LL.M., MA., a PhD Candidate at the Center for Biomedical Ethics and Law, University of Leuven. Her research interests revolve around ethical, legal and social aspects of genetics and genomics research including governance of biobanks and global collaborative genomics research and data sharing. Originally published in the Bill of Health blog. Introducing data sharing practices into genomic research has brought a number of concerns in research ethics and governance to the fore. For instance, research participants and the general public raised concerns about potential privacy issues in personal genomic data protection, as well as the scope of the secondary uses. In order to address such concerns, Data Access Committees (DACs) were seen as crucial in the governance of main genomic databases such as the database of Genotypes and Phenotypes (dbGaP) and the European Genome-phenome Archive (EGA). Surprisingly, the component of access review, the structure, and the functionality of such committees have been barely scrutinized to date. In a recent study published in Genetics in Medicine, we solicited the opinion of 20 DAC members and experts on genomic data access. Specifically, the interviewees were asked about the goals of access review and their experiences with reviewing the ethical and scientific aspects of […]

GenomeConnect: connecting patients and researchers

This is a guest post by the GenomeConnect team. Patients with new genetic diagnoses are increasingly turning to social media and other web resources to try and find other families with the same genetic diagnosis and research opportunities. GenomeConnect, an online patient registry developed as part of the National Institutes of Health funded Clinical Genome Resource (ClinGen) project, is a resource to help patients form connections and partner with researchers to make genomic advances possible. Participation and enrollment in GenomeConnect are open to anyone who has had genetic testing, regardless of diagnosis or test result. Additionally, participation is completely online, allowing individuals from around the world to participate. After completing the online consent process, participants are asked to complete a health survey that reviews each body system to capture basic health information. From there, participants are asked to upload their genetic testing report to allow GenomeConnect staff to capture important genomic information. After participants have shared their genetic and health information through the online portal, that information is prepared for de-identified sharing with approved, publicly available databases, such as NCBI’s ClinVar database, a repository for genomic variants. Once enrolled, GenomeConnect participants have the ability to match with one another via […]

What Open Access is and what it is not

This is a guest post by Nancy Pontika, Open Access Aggregation Officer at the COnnecting REpositories (CORE) project, Knowledge Media Institute, Open University. What is Open Access and why is it useful? The scholarly communications landscape is constantly changing. Printed journals have been replaced with electronic publications; authors have moved away from strict copyright rules, such as “All Rights Reserved” licenses, and shifted to the use of licenses with more flexible rights that allow content re-use, like Creative Commons; finally, creators of scientific content are more willing than ever to share their research findings from their own computers with everyone in the world. These three aforementioned components constitute the definition of “open access” (OA), which is the movement that aims to disseminate digital scientific content online and free of cost, with limited or no rights restrictions. Established by the Budapest Open Access Initiative (BOAI) in 2002, OA can be delivered via two main routes: open access journals (Gold OA) and repositories (Green OA); the latter are further divided into two main categories of subject and institutional repositories. OA attempts to provide a viable solution to the journal crisis and the constantly increasing subscription prices of scientific journals, which rise faster than […]

Why do I want to share my genetic data?

This is a guest post by Craig Macpherson about why he wants to share his genetic data. He is the founder and editor of DNA Testing Choice, a reviews site for the DNA tests you can take at home. Why have I recently had my Whole Genome Sequencing (WGS) done, and why do I want to make this data publicly available? Let me take you back a few years so I can answer this question… Although I founded DNA Testing Choice two years ago, I actually came up with the idea in 2010 when I read an article in The Times. A journalist had taken three home DNA tests to establish his genetic predisposition to glaucoma. The results of these tests were broadly similar, but the interpretation of his genetic variants differed significantly. This raised two interesting questions for me: 1) ‘how would I work out which home DNA test to take, given the complex nature of the service?’, and 2) ‘how would I verify the interpretation I received?’ The first question inspired the site, the second inspired me to buy my WGS. It seemed to me that those taking home DNA tests should be able to separate the raw genetic […]

What is so great about ICD-10?

This is a guest post about ICD-10 by Laura O’Donnell who writes on behalf of EHR, electronic health record experts at OmniMD. As of October 1, 2015 the International Statistical Classification of Diseases and Related Health Problems (ICD) is effectively now in ‘round 10.’ This means that all providers covered by HIPAA (the Health Insurance Portability and Accountability Act) are required to make the transition from ICD-9 to ICD-10. It is anticipated that ICD-10’s contribution to precise and meaningful data integration and sharing – across the industry as well as our institutes of research – will further our understanding of medical complications and clarify the connection between a patient’s condition and their physician’s performance. How does sharing data affect research? Applying standard ontologies (vocabularies) to organize and leverage the power of shared knowledge across various disciplines has significantly changed the face of research. As regards this post, the medical industry’s adoption of ICD language has enabled us to discover and react to patterns affecting public health. For example, prognosis research is using data sharing to focus on future outcomes for patients with particular diseases. Physicians, clinics, hospitals and pharmaceutical companies now combine their respective knowledge with actual patient data. […]

Data sharing to support UK clinical genetics and genomics services

This is a guest post by Sobia Raza – a policy analyst specialising in data science at the PHG Foundation. Originally published here under the title “Responsible, proportionate data sharing for better and safer genetic services”. Introduction: The PHG Foundation is a health policy organisation with a focus on how genomics and other emerging health technologies can provide more effective, personalised healthcare. The Association for Clinical Genetic Science (ACGS) is the professional association for clinical genetics scientists in the UK. The two organisations have recently collaborated to deliver a joint report which examines the challenges to data sharing within UK clinical genetics and genomics services and identifies priority areas for policy development. The report underscores how data sharing is essential to the delivery of NHS clinical genetics services and the clinical care of patients. Access to high quality data on genomic variants can not only inform the diagnosis and clinical management of patients, but also reduce the risk of potential misdiagnoses arising from insufficient or incorrect information about these variants. Other serious consequences of sub-optimal data sharing are delays in patient diagnosis and variations in the quality of testing services. Yet despite the clinical importance of data sharing, current […]

Highlights of 2015: Blog Posts & Publications

2015 was a great year for DNAdigest! We organised more events, welcomed more volunteers to the team and increased our output of online communications and blog generation! It has been a joy to watch our followers and online community grow with us and as we approach the end of the year, we want to dedicate this blog post to looking back over 2015 and celebrate the achievements we made together! Blog Posts: April 8th – ‘Genomic Data Sharing – Ethical and Scientific Imperative’ We chose this guest post by Mahsa Shabani because it was one of the most popular blog posts from 2015. Here Mahsa discusses how sharing data via controlled-access databases has been seen as an answer to the identified privacy and legal complications of sharing data. While the structure, membership and procedure of access review vary across DACs, Mahsa warns that such access review mechanisms have rarely received attention. By establishing adequate oversight mechanisms on data sharing, progressive and responsible data use will be on the horizon. Read more . . . July 8th – ‘The Sharers’ Leaderboard: an h-index of data sharing’ We chose this guest post by Kate Hodesdon from Seven Bridges Genomics because it discusses the possibility of codifying best practices of […]

Why we should stop talking about data sharing

This is a guest post by Barbara Prainsack. Barbara Prainsack is a Professor at the Department of Social Science, Health & Medicine at King’s College London. She has published widely on social, ethical and regulatory issues related to genomic research and medicine. A book (with Alena Buyx) on Solidarity in Biomedicine and Beyond, which includes a case study on database governance, will be published by Cambridge University Press next year. Barbara is in the process of finalising a monograph on Personalization from Below: Participatory Medicine in the 21st Century (under contract with New York University Press). A lot of people who promote data sharing – including the people behind DNAdigest – are doing great things; they devote their time to finding ways to utilise and re-use data in ways that promote disease research, advance knowledge, and create public benefits. The people behind these initiatives, and those who contribute their own data to them (see for example DNA.Land, OpenSNP and Genes for Good, which are all initiatives aiming at data sharing for public benefit), are pioneers in creating social value. At the same time, some of the voices in the choir of those who call for data sharing belong to commercial companies. […]

The key elements of good data sharing practice

This is a guest post by Wellcome Trust. Originally published on blog.wellcome.ac.uk. The Wellcome Trust is a leading partner in the Public Health Research Data Forum, which brings together research funders who are committed to increasing the sharing of health research data in ways that are equitable, ethical and efficient and will accelerate improvements in public health. On behalf of the Forum, the Trust funded a major international study of stakeholders’ views about best practices for sharing public health research data from low and middle income settings, which recently published its results. Dr Susan Bull and Prof Michael Parker, from The Ethox Centre, University of Oxford, discuss the key issues and findings of the study. Data-sharing is increasingly seen as an important component of effective and efficient biomedical research – both by researchers and by research funders. At the same time, it is recognised that efforts to increase access to individual-level data raise important ethical and governance challenges, some of which may vary depending on the context in which the research takes place. The primary argument in favour of more routine sharing of de-identified research data is its potential to generate more – and higher quality – science. This could in turn lead to improved health outcomes, and promoting […]

Information management: to federate or not to federate

This is a guest post by Yasmin Alam-Faruque, member of Eagle Genomics’ Biocuration team. Originally published on eaglegenomics.com. Information management is a key organisational activity that concerns the acquisition, organisation, cataloguing and structuring of information from multiple sources and its distribution to those who need it. From a scientist’s perspective, experimental results are the most important pieces of information that are analysed and interpreted to make new biological discoveries. Unless you are the one generating the results, it is not always an easy task to find and gather all other relevant datasets and documents that you need for further comparison and analyses. What is the current approach? Currently, sharing of data between researchers is a manual and complex process, which causes inefficiency since a significant fraction of researcher time is spent on this activity. New high-throughput technologies generating huge datasets are compounding the problem. We argue that new information management approaches based on data federation can help address this problem, thus leading to quicker analyses and discovery of new biological insights. Data federation is a form of data consolidation, whereby data is collected from distinct databases without ever copying or transferring the original data itself. It combines result sets from across multiple source systems and […]
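
The idea of combining result sets from several source systems can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the approach described in the original post: the two in-memory SQLite databases, the `samples` table, the source names `lab_a` and `lab_b`, and the helper functions `make_source` and `federated_query` are all hypothetical. Each source answers the same query locally, and only the matching rows are merged by the caller; the source tables themselves are never copied into a central store.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory database standing in for one independent data source (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (sample_id TEXT, gene TEXT, expression REAL)")
    conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def federated_query(sources, gene):
    """Run the same query against every source and merge the result sets.

    Only rows matching the query leave each source; the tables stay where they are.
    """
    combined = []
    for name, conn in sources.items():
        cursor = conn.execute(
            "SELECT sample_id, expression FROM samples WHERE gene = ? ORDER BY sample_id",
            (gene,),
        )
        for sample_id, expression in cursor:
            # Tag each row with its origin so the merged view stays traceable.
            combined.append(
                {"source": name, "sample_id": sample_id, "expression": expression}
            )
    return combined


# Two made-up sources with made-up records, purely for illustration.
sources = {
    "lab_a": make_source([("A1", "TP53", 2.5), ("A2", "BRCA1", 1.0)]),
    "lab_b": make_source([("B1", "TP53", 3.0)]),
}
results = federated_query(sources, "TP53")
for row in results:
    print(row)
```

In a real federation the sources would be remote systems with their own access controls, and the query layer would also have to reconcile differing schemas and vocabularies, which this sketch leaves out.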

How research data sharing can save lives

This is a guest post by Trish Groves, head of research at The BMJ. Originally published on the BMJ.com website. Everyone’s been missing a trick. The whole debate on sharing clinical study data has focused on transparency, reproducibility, and completing the evidence base for treatments. Yet public health emergencies such as the Ebola and MERS outbreaks provide a vitally important reason for sharing study data, usually before publication or even before submission to a journal, and ideally in a public repository. Not just from randomised controlled trials, but from case series and samples, lab testing studies, surveillance studies, viral sequencing, genomic work, and other epidemiological observational studies too. During the Ebola crisis, researchers couldn’t or wouldn’t share data. Last week WHO held a consultation meeting in Geneva to tackle this. One big reason for withholding data was the mostly unfounded fear of having subsequent papers rejected by journals. But researchers capturing vital information in the field and in coordinating centres were too busy to write and submit those papers, and thus much time was lost before vital information could be disseminated. Did people die because of the Ingelfinger rule against prior publication? There were also, of course, some commercial disincentives to early data sharing, with […]

ReScience: ensuring that the original research is reproducible

Reproducibility is a cornerstone of science: the results obtained by researcher A must be identical to the results obtained by researcher B provided they follow identical protocols and use identical reagents. In reality, multiple factors can lead to irreproducible results. They include poor training of researchers in experimental design; increased emphasis on making provocative statements rather than presenting technical details; and publications that do not report basic elements of experimental design. Therefore, initiatives working on reproducibility issues are indispensable for scientific progress. We are happy to present this guest post by Nicolas Rougier from ReScience – a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research is reproducible. The ReScience initiative: In March 2015, Nicolas Rougier and his colleagues published a commentary in the “Frontiers in Computational Neuroscience” journal that highlighted the difficulties they encountered when trying to replicate a model from the literature. Sources were not available on a public repository (they needed to be requested from one of the authors), code was not under version control, there were some factual errors and ambiguities in the description […]

When Counting is Hard: the Making Data Count project

This is a guest post by Jennifer Lin, project manager for the Making Data Count project. Originally published here. Counting is hard. But when it comes to research data, not in the way we thought it was (example 1, example 2, example 3). The Making Data Count (MDC) project aims to go further – measurement. But to do so, we must start with basic counting: 1, 2, 3… uno, dos, tres… MDC is an NSF-funded project to design and develop metrics that track and measure data use, “data-level metrics” (DLM). DLM are a multi-dimensional suite of indicators, measuring the broad range of activities surrounding the reach and use of data as a research output. Our team, made up of staff from the University of California Curation Center at California Digital Library, PLOS, and DataONE, investigated the validity and feasibility of using metrics by collecting and investigating the use of harvested data to power discovery and reporting of datasets that are part of scholarly outputs. To do this, we extended Lagotto, an open source application, to track datasets and collect a host of online activity surrounding datasets from usage to references, social shares, discussions, and citations. During this pilot phase we […]

How doing Open Science has helped advance my career

Last week we sent details of how to win $1,000 in The Winnower open science writing competition. This week we bring you a blog post from Bastian Greshake, one of the participants in the competition. Bastian’s story shows how supporting open genetic data access had a lasting impact on his academic career, contributed to lots of new skills, led to winning awards and helped him find jobs and collaborators. Bastian Greshake, co-founder of OpenSNP. What Have I Done?! There are many firm believers in the different kinds of openness: open access, open source, open data, open science, open you-name-it. And at least to me, some of the most interesting things happen at the intersection of those different opens. Which probably is where openSNP – the project I co-founded in 2011 – can be located. It’s an open source project which tries to crowdsource collecting open genetic data. This is done by enabling people to donate their personal genetic information into the public domain, alongside phenotypic annotations. And for good measure we also factor in open access, by text mining the Public Library of Science and other open databases for primary literature. What started as a somewhat freakish idea in 2011 has by mid–2015 […]

Blockchain and Digital Health – First Impressions

Guest Post by Rodrigo Barnes, Chief Technology Officer at Aridhia. This blog post was originally published on the Aridhia website on 25 August 2015. The blog post was inspired by the Ethereum Workshop at the Turing Festival in Edinburgh. Among the many great Edinburgh festivals, the Turing Festival is the most important to the tech start-up scene locally and beyond. This weekend, I attended the Ethereum Workshop to learn about a type of “blockchain” technology and to think about how it might facilitate innovation in digital health. There’s even interest in this for genomic data sharing, as the Global Alliance and Kaiser Permanente’s John Mattison has suggested. Most people in tech have heard of Bitcoin, the cryptocurrency that is exciting libertarians and central bankers alike. One thing I learned this weekend is that, at their heart, Bitcoin and related technologies can be seen as essentially ‘open ledgers’ where transactions are recorded in a very public way, and can’t be repudiated. The gist of this is that the open ledger can be trusted even though, because of the way it is implemented, there is no central authority vouching for it. The system of maintaining the ledger is the decentralised processing of the blockchain. The question I asked myself is “how could this be applied to digital […]
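
The “open ledger” idea in the excerpt above can be illustrated with a minimal sketch, assuming nothing about how Bitcoin or Ethereum are actually implemented. Each entry stores the hash of the previous entry, so altering an earlier record invalidates everything after it. The function names (`entry_hash`, `append`, `verify`) and the example payloads are hypothetical, chosen only for illustration, and the sketch deliberately leaves out the decentralised processing (consensus between many nodes) that the post refers to.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (an assumption of this sketch)


def entry_hash(index, payload, prev_hash):
    """Return the SHA-256 hex digest of one ledger entry's contents."""
    record = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append(ledger, payload):
    """Append a new entry that is linked to the hash of the previous one."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    index = len(ledger)
    ledger.append(
        {
            "index": index,
            "payload": payload,
            "prev": prev_hash,
            "hash": entry_hash(index, payload, prev_hash),
        }
    )


def verify(ledger):
    """Recompute every hash and check each link back to its predecessor."""
    prev_hash = GENESIS
    for i, entry in enumerate(ledger):
        if entry["index"] != i or entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["index"], entry["payload"], entry["prev"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    ledger = []
    append(ledger, "access requested: dataset A")  # hypothetical transaction
    append(ledger, "access granted: dataset A")    # hypothetical transaction
    print(verify(ledger))                          # True: the chain is intact
    ledger[0]["payload"] = "access denied"         # tamper with an earlier record
    print(verify(ledger))                          # False: the stored hash no longer matches
```

Anyone holding a copy of such a ledger can rerun `verify` independently, which is roughly what makes the record hard to repudiate without a central authority vouching for it.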

The Sharers’ Leaderboard: an h-index for data sharing

The idea for this guest post by Kate Hodesdon of Seven Bridges Genomics grew out of a discussion with Adam Resnick (Children’s Hospital of Philadelphia) and Deniz Kural (Seven Bridges Genomics). There is widespread recognition that sharing data benefits science. In this article, I’ll examine the best practices of data sharing, and assess the prospects for codifying these into a metric for how well scientists share data. When scientists say that sharing data is good for science, they have certain models of sharing and certain kinds of data in mind. I want to look at what makes someone a good sharer of data. For instance, simply being a prolific sharer is useless if the quality or relevance of the data is poor. And sharing high-quality data is not helpful if you store it in an insecure repository, or an obscure format. Clarifying best practices of data sharing will help us maximize the value of shared data, but it can also play another important role of helping to incentivize data sharing. The problem of incentivization is that while data sharing undoubtedly benefits scientific progress, it is only beneficial to individuals if they can take advantage of another’s shared data. In […]

Open Source Technologies for Precision Medicine

The Hyve is a 30-person open source bioinformatics services company from Utrecht, Netherlands, and Cambridge, MA, USA. DNAdigest invited them to write a blog post on the summit “Open Source Technologies for Precision Medicine” that they organised in the beginning of June 2015 together with the life science consultancy Proventa International. The summit “Open Source Technologies for Precision Medicine” took place in London on June 03 and had good attendance from both industry and academia. The round table discussions and the panel discussion led by Keith Elliston (CEO, tranSMART Foundation), John Wise (Executive Director, Pistoia Alliance), Paul Avillach (Assistant Professor, Harvard Medical School), Jay Bergeron (Director Translational & Bioinformatics, Pfizer), Gerrit Meijer (Professor, Netherlands Cancer Institute) and Kees van Bochove (CEO, The Hyve) resulted in some constructive conclusions about the current situation with open source as a means to achieve precision medicine. The main conclusions from the meeting: The ultimate business driver for adopting open source technologies such as tranSMART and cBioPortal seems to be access to data. For pharma IT, this especially means easy access to data from academics and non-profits, as well as annotated public studies and studies from (public-private) collaborations. Of course there is also the […]

Ecological Perspective on Data Sharing

We have invited Charlie Outhwaite (@charlielouo) to write a guest blog post on the topic of openness and data sharing from an ecological point of view. The post gives us a great opportunity to draw a parallel, showing how the same types of data sharing problems we are experiencing in the field of genomics are observed across different scientific disciplines. The field of ecology is a vast and varied one. As a result, the types and quantities of data produced differ hugely. Whether a study is small in scale, such as a field or lab based project, or a large, country or global scale, big data study, the amount of data that could be made available is enormous. Yet the field of ecology has been considered as behind in terms of its openness when compared to other areas of biology such as genomics. With such vast amounts and types of data available, sharing that data openly has the potential to boost research opportunities and open up collaboration within and between fields. As is the case within many scientific disciplines, a major barrier for data sharing in ecology is the fear of being scooped. For this reason, many researchers would be unlikely […]

The War Against Cancer

This is the second part of a guest blog post written by Dave Dubin. Read the first part here. Let’s see how far we’ve come… Since 2007, we have written, made appearances, and held events in order to bring awareness to what I have gone through with the goal of reaching a younger audience, including women. Every conference we’ve attended has talked about the under-fifty age group, whose numbers are increasing every year, as well as genetics and genetic testing, even genomics. Soccer is a worldwide game, with as many women participating as men, and it encompasses all ages and levels of ethnicity and socioeconomic background. We have had events with women’s and men’s professional soccer teams from the National Women’s Soccer League, Major League Soccer, North American Soccer League and college teams. My voice has been used to narrate videos and I’ve moderated a webinar about knowing your family history. Today, cancer is as much about finding the proverbial “needle in the haystack” as it is about curing and prevention. Immunotherapy is a big topic. If a group of individual family members all get a certain genetic mutation, and only two of the three siblings are affected by the disease, […]

Family Trio Sequencing – Genetic Clues in Autism

This is a guest blog post written by KT Pickard (@kthomaspickard) and Kimberly Pickard (@kimberlypickard), Co-founders of StartCodon. Amazingly, the cost of whole genome sequencing is now 100,000 times less expensive than it was a dozen years ago. If the Tesla Model S followed this trajectory, you could buy one today for less than $1 USD. This super logarithmic decline puts genomics on par with desktop publishing or 3D printing: it has become something that you can affordably do yourself. My wife, Kimberly, and I were excited about the prospect of having our genomes sequenced. Our daughter has autism, and like many parents of special needs children, we were eager to explore the underlying causes of her condition. We “got genomed” last year by enrolling in Illumina’s Understand Your Genome program. We received our whole genome sequencing (WGS) data, as well as limited predisposition and carrier screening for a number of Mendelian traits. As many DNAdigest readers know, the cost of WGS continues to drop, almost to the $1,000 genome that Illumina announced last year. Kimberly and I were intrigued to learn that we were both carriers of some rare genetic variants. Could our genetic idiosyncrasies be contributing to our […]

A patient advocate for cancer research

This is the first part of a guest blog post written by Dave Dubin. Read the second part here. 1997 seems so far away.  I’m 29, still a strapping 200 plus pounds, playing soccer, managing the business, recently married with first house and first son.  As much as “family history of colon cancer” is written all over the chart, I’m sent away by my primary physician when I have symptoms.  A few months later, symptoms of blood in the stool and cramping don’t go away.  A gastroenterologist finally confirms stage three colon cancer.  I have what will become the first of several surgeries at Mt Sinai Hospital in Manhattan, and the start of what would become much more than a patient-doctor relationship with Gastroenterologist Blair Lewis and Brian Katz, my surgeon. Three years after my surgery, my older brother develops colon cancer.  Since he started getting screened by Blair Lewis after my episode, his is caught earlier.  Brian Katz is his surgeon as well, and since laparoscopic surgery is now more prevalent at Mt Sinai, his is less invasive and scars are smaller.  No chemo.  I notice how my parents have a difficult time watching their son go through this.  […]

Genomic Data Sharing – Ethical and Scientific Imperative

This is a guest blog post written by Mahsa Shabani (@Mahsashabani). Genomic data sharing has become an ethical and scientific imperative in recent years. Funding organizations, research institutes and journals, among others, have endorsed the significance of data sharing practices to the progress of research and an optimal use of community resources. Consequently, researchers all around the world are extensively involved in the data sharing process, ranging from data production to data use. As sharing practices do involve individuals’ data, the associated ethical and legal concerns should receive thorough attention in order to respect individuals’ rights and maintain public trust. Sharing data via controlled-access public databases has been seen as an answer to the identified concerns at the moment. Data Access Committees (DACs) constructed locally or in a central fashion control access to these datasets according to defined criteria. Evaluating the qualification/eligibility of data users, ethical and scientific grounds of proposed uses and oversight on downstream data uses are considered as the main responsibilities of DACs. While the structure, membership and procedure of access review vary across DACs, some similarities in approaches and mechanisms are observed. A requirement of preparing a summary of data use and signing a data access agreement […]

Involving participants in genomics research

Guest post by Dr Anna Middleton, Senior Staff Scientist, Wellcome Trust Sanger Institute. This blog post was originally published by the Nuffield Council on Bioethics. The blog post is based on the talk Dr Middleton gave at the launch of the Council’s report: The collection, linking and use of data in biomedical research and health care: ethical issues. My career has explored, from multiple different perspectives, the impact of genomics on people. Genomics refers to the study of a person’s 20,000 or so genes. Given the almost infinite ways that people can be genetically different to each other, genomic research often needs to be done on a very large scale in order to be able to interpret the significance of findings, particularly a rare genetic change. So, Big Data and Genomics go hand in hand. To give you an example, I’m currently part of the Deciphering Developmental Disorders (DDD) project at the Sanger Institute which seeks to offer cutting edge genomic testing to 12,000 children from the NHS with severe, complex, physical and/or intellectual disability. These children have exceptionally rare conditions that their doctors may never have seen before. Using an online database that contains large sets of health and biological […]

Genome sequencing: What do patients think?

Guest post by Alice Hazelton from the Genetic Alliance UK. Genomic information has the potential to transform healthcare. Researchers are continually learning more about the genome and the genetic basis of disease and as the cost of genome sequencing technologies and analytics tools decreases, more and more research will become possible. This will help us to achieve a greater understanding of how our genes affect our health and develop new diagnostic tools, screening methods and treatments for some conditions. The sharing of patient data will play a crucial role in this. Whenever the sharing of patient data is discussed, public debate ensues over concerns about data security, privacy and access. But little work has been done to establish what patients, as the end-beneficiaries of medical research, think about sharing data. Through an online engagement project, ‘My Condition, My DNA’, Genetic Alliance UK sought the views of patients affected by rare and genetic conditions, both diagnosed and undiagnosed, on genome sequencing. Four sessions including text, podcasts, videos and questions were distributed to patients over the course of four weeks, allowing them to take part in their own homes at a time convenient for them. One of these sessions was about the use […]

Giving research data the credit it’s due

Guest post by Sarah H Carl (@sarahhcarl). In many ways, the currency of the scientific world is publications. Published articles are seen as proof – often by colleagues and future employers – of the quality, relevance and impact of a researcher’s work. Scientists read papers to familiarize themselves with new results and techniques, and then they cite those papers in their own publications, increasing the recognition and spread of the most useful articles. However, while there is undoubtedly a role for publishing a nicely-packaged, (hopefully) well-written interpretation of one’s work, are publications really the most valuable product that we as scientists have to offer one another? As biology moves more and more towards large-scale, high-throughput techniques – think all of the ‘omics – an increasingly large proportion of researchers’ time and effort is spent generating, processing and analyzing datasets. In genomics, large sequencing consortia like the Human Genome Project or ENCODE were funded in part to generate public resources that could serve as roadmaps to guide future scientists. However, in smaller labs, all too often after a particular set of questions is answered, large datasets end up languishing on a dusty server somewhere. Even for projects whose express purpose is […]
