Month: October 2015

The key elements of good data sharing practice

This is a guest post by the Wellcome Trust, originally published elsewhere.

The Wellcome Trust is a leading partner in the Public Health Research Data Forum, which brings together research funders who are committed to increasing the sharing of health research data in ways that are equitable, ethical and efficient and will accelerate improvements in public health. On behalf of the Forum, the Trust funded a major international study of stakeholders’ views about best practices for sharing public health research data from low- and middle-income settings, which recently published its results. Dr Susan Bull and Prof Michael Parker, from The Ethox Centre, University of Oxford, discuss the key issues and findings of the study.

Data sharing is increasingly seen as an important component of effective and efficient biomedical research, by both researchers and research funders. At the same time, it is recognised that efforts to increase access to individual-level data raise important ethical and governance challenges, some of which may vary depending on the context in which the research takes place. The primary argument in favour of more routine sharing of de-identified research data is its potential to generate more – and higher quality – science. This could in turn lead to improved health outcomes, and promoting […]

Information management: to federate or not to federate

This is a guest post by Yasmin Alam-Faruque, member of Eagle Genomics’ Biocuration team, originally published elsewhere.

Information management is a key organisational activity that concerns the acquisition, organisation, cataloguing and structuring of information from multiple sources and its distribution to those who need it. From a scientist’s perspective, experimental results are the most important pieces of information: they are analysed and interpreted to make new biological discoveries. Unless you are the one generating the results, it is not always easy to find and gather all the other relevant datasets and documents that you need for further comparison and analysis.

What is the current approach?

Currently, sharing data between researchers is a manual and complex process, and it is inefficient because a significant fraction of researcher time is spent on it. New high-throughput technologies generating huge datasets are compounding the problem. We argue that new information management approaches based on data federation can help address this problem, leading to quicker analyses and the discovery of new biological insights. Data federation is a form of data consolidation, whereby data is collected from distinct databases without ever copying or transferring the original data itself. It combines result sets from across multiple source systems and […]
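As a rough illustration of the data federation idea described in this excerpt, the following minimal Python sketch runs the same query against two independent databases and combines only the result sets, leaving the source tables where they are. Everything in it is an assumption made for illustration: the `samples` table, its columns, the `lab_a` and `lab_b` source names, and the helper functions `make_source` and `federated_query` do not come from the original post, and in-memory SQLite databases merely stand in for real, separately hosted systems.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory database standing in for one independent source (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (sample_id TEXT, gene TEXT, expression REAL)")
    conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def federated_query(sources, gene):
    """Run the same query at each source and combine only the matching rows."""
    combined = []
    for name, conn in sources.items():
        cursor = conn.execute(
            "SELECT sample_id, expression FROM samples "
            "WHERE gene = ? ORDER BY sample_id",
            (gene,),
        )
        for sample_id, expression in cursor:
            # Tag each row with its source so provenance is not lost.
            combined.append(
                {"source": name, "sample_id": sample_id, "expression": expression}
            )
    return combined


# Two hypothetical labs, each keeping its own database.
sources = {
    "lab_a": make_source([("A1", "BRCA1", 2.5), ("A2", "TP53", 1.1)]),
    "lab_b": make_source([("B1", "BRCA1", 3.0), ("B2", "BRCA1", 0.7)]),
}

results = federated_query(sources, "BRCA1")
for row in results:
    print(row)
```

In this sketch the full tables are never copied into a central store; only the rows that answer the query leave each source. A real federation layer would also have to handle differing schemas, access control and network failures, none of which is shown here.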

How research data sharing can save lives

This is a guest post by Trish Groves, head of research at The BMJ, originally published elsewhere.

Everyone’s been missing a trick. The whole debate on sharing clinical study data has focused on transparency, reproducibility, and completing the evidence base for treatments. Yet public health emergencies such as the Ebola and MERS outbreaks provide a vitally important reason for sharing study data, usually before publication or even before submission to a journal, and ideally in a public repository. This applies not just to data from randomised controlled trials, but also to data from case series and samples, lab testing studies, surveillance studies, viral sequencing, genomic work, and other epidemiological observational studies. During the Ebola crisis, researchers couldn’t or wouldn’t share data. Last week WHO held a consultation meeting in Geneva to tackle this. One big reason for withholding data was the mostly unfounded fear of having subsequent papers rejected by journals. But researchers capturing vital information in the field and in coordinating centres were too busy to write and submit those papers, and thus much time was lost before vital information could be disseminated. Did people die because of the Ingelfinger rule against prior publication? There were also, of course, some commercial disincentives to early data sharing, with […]

DNAdigest interviews OpenSNP

In September we featured a guest post from Bastian Greshake on how Open Science helped him advance his career. This blog post was in conjunction with the Winnower Writing Competition, an initiative designed to prove the benefits of data sharing and collaboration and to show how they can help researchers advance in their careers. We caught up with Bastian and interviewed him about openSNP, a public platform developed to give people a chance to donate their personal genetic data, as well as his current challenges, explaining privacy concerns to the general public, and the future of openSNP.

1. What is OpenSNP? (i.e. What does it do? Who is the target audience? Who can use it and how? What is it good for?)

The elevator pitch is: openSNP is a platform that gives people a chance to donate their personal genetic data into the public domain, along with phenotypic annotations. Citizen scientists, educators and everyone else can then access the data and use it for their own ends. So we are targeting two audiences at the same time. On the one hand, we want people to share their data and hopefully learn something useful in the process; on the other hand, we want people to re-use the data for interesting projects. For the first group, […]

ReScience: ensuring that the original research is reproducible

Reproducibility is a cornerstone of science: the results obtained by researcher A must be identical to the results obtained by researcher B, provided they follow identical protocols and use identical reagents. In reality, multiple factors can lead to irreproducible results. They include poor training of researchers in experimental design, increased emphasis on making provocative statements rather than presenting technical details, and publications that do not report basic elements of experimental design. Initiatives that address reproducibility are therefore indispensable for scientific progress. We are happy to present this guest post by Nicolas Rougier from ReScience – a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research is reproducible.

The ReScience initiative

In March 2015, Nicolas Rougier and his colleagues published a commentary in the journal “Frontiers in Computational Neuroscience” that highlighted the difficulties they encountered when trying to replicate a model from the literature. Sources were not available in a public repository (they had to be requested from one of the authors), code was not under version control, there were some factual errors and ambiguities in the description […]