Craig Smith

Get the most out of your impact data

It’s time to put our impact data to work to get a better understanding of the value, use and re-use of research. Published under a CC BY 3.0 license. Originally published by Liz Allen, PhD on the London School of Economics and Political Science Blog. If published articles and research data are subject to open access and sharing mandates, why not also the data on impact-related activity of research outputs? Liz Allen argues that the curation of an open ‘impact genome project’ could go a long way in remedying our limited understanding of impact. Of course there would be lots of variants in the type of impact ‘sequenced’, but the analysis of ‘big data’ on impact could facilitate the development of meaningful indicators of the value, use and re-use of research. We know that research impact takes many forms, has many dimensions and is not static, as knowledge evolves and the opportunities to do something with that knowledge expand. Over the last decade, research institutions and funding agencies have got good at capturing, counting and describing the outputs emerging from research. A lot of time and money has been invested by funding agencies to implement grant reporting platforms to capture the myriad outputs and products of research (e.g. […]

Objections to data sharing don’t stand up to scrutiny

Who’s afraid of Open Data: Scientists’ objections to data sharing don’t stand up to scrutiny. Many scientists are still resisting data sharing calls. Whilst their concerns should be taken seriously, Dorothy Bishop doesn’t think the objections withstand scrutiny. Concerns about being scooped are frequently cited, but are seldom justified. If we move to a situation where a dataset is a publication, then the original researcher will get credit every time someone else uses the dataset. And in general, having more than one person doing an analysis is an important safeguard for science. I was at a small conference last year, catching up on gossip over drinks, and somehow the topic moved on to journals, and the pros and cons of publishing in different outlets. I was doing my best to advocate for open access, and to challenge the obsession with journal impact factors. I was getting the usual stuff about how early-career scientists couldn’t hope to have a career unless they had papers in Nature and Science, but then the conversation took an interesting turn. “Anyhow,” said eminent Professor X. “One of my postdocs had a really bad experience with a PLOS journal.” Everyone was agog. Nothing better at conference drinks than a new twist on […]

Enabling the Effective Sharing of Clinical Data

This blog post was written for DNAdigest by Mathias Astell, Marketing Manager for Nature Publishing Group & Iain Hrynaszkiewicz, Head of Data and HSS Publishing for Nature Publishing Group. The benefits of sharing data generated by researchers have long been understood to be of great value to science (as exemplified by this British Medical Journal piece from 1994). And over recent years there has been a rapid increase in the ability to share and access research data – as can be seen in the rise of data journals (such as Scientific Data and Gigascience), the increase in research data repositories (both general and subject-specific), and the establishment of data sharing policies around the world. However, in the medical world large amounts of clinical research can go unpublished and a large number of clinical trials go unregistered (almost 40% according to one study) – meaning we only have a partial account of what data have been gathered in medical research, let alone data that may be available to others. On top of this problem of non-publication, there is also evidence that reporting of research in medical literature favours positive results (as can be seen in this study and this one). All of which […]

Highlights of 2015: Blog Posts & Publications

2015 was a great year for DNAdigest! We organised more events, welcomed more volunteers to the team and increased our output of online communications and blog generation! It has been a joy to watch our followers and online community grow with us and as we approach the end of the year, we want to dedicate this blog post to looking back over 2015 and celebrating the achievements we made together! Blog Posts: April 8th – ‘Genomic Data Sharing – Ethical and Scientific Imperative’ We chose this guest post by Mahsa Shabani because it was one of the most popular blog posts from 2015. Here Mahsa discusses how sharing data via controlled-access databases has been seen as an answer to the identified privacy and legal complications of sharing data. While the structure, membership and procedure of access review vary across DACs, Mahsa warns that such access review mechanisms have rarely received attention. By establishing adequate oversight mechanisms on data sharing, progressive and responsible data use will be on the horizon. Read more . . . July 8th – ‘The Sharers’ Leaderboard: an h-index of data sharing’ We chose this guest post by Kate Hodesdon from Seven Bridges Genomics because it discusses the possibility of codifying best practices of […]

Highlights of 2015: Interviews

2015 was a great year for DNAdigest! We organised more events, welcomed more volunteers to the team and increased our output of online communications and blog generation! It has been a joy to watch our followers and online community grow with us and as we approach Christmas and the end of the year, we want to dedicate this blog post to looking back over 2015 and celebrating the achievements we made together! Interviews: March – DNAdigest Interviews Genomic Medicine Alliance (Part 1 and 2) We included this in our 2015 overview because our Genomic Medicine Alliance (GMA) interview was the only one that spread over 2 parts and formed the most in-depth interview of the year. In part 1, we caught up with Professor George P. Patrinos, a member of the Scientific Advisory Committee for the GMA, who explains what GMA is, how to join and the benefits of doing so. In part 2, George discusses his role and explains the 7 different working groups within the GMA: Genomic Informatics, Pharmacogenomics, Cancer Genomics, Rare Diseases and Drug Outcomes, Public Health Genomics, Genethics and Economic Evaluation in Genomic Medicine. Read more . . . April – DNAdigest interviews GA4GH We included this interview in the 2015 […]

Highlights of 2015 – Events

2015 was a great year for DNAdigest! We organised more events, welcomed more volunteers to the team and increased our output of online communications and blog generation! It has been a joy to watch our followers and online community grow with us and as we approach Christmas and the end of the year, we want to dedicate this blog post to looking back over 2015 and celebrating the achievements we made together! Events: February 28th – DNAdigest Hackday Only 2 months into 2015 and we hosted our first hackday. We invited our followers and the local scientific community to a day of brainstorming, ideation and hacking for the benefit of genetics research. Held in our new office at the Future Business Centre in Cambridge, we welcomed and encouraged geneticists, bioinformaticians, software developers and anyone with an interest in public genomic datasets to join forces. Together we addressed how to make a ‘recommendation service’ that will recommend datasets that a person may find interesting based on their dataset access history and how to make an automated alert system that notifies a user when a new dataset is added or made available. Read more . . . August 21st – DNAdigest Symposium In August we […]

DNAdigest Interviews EMC

EMC is a global tech organisation renowned for storage and management of big data. With VMware and Pivotal, EMC has moved from storage to virtualisation to app development. And for the last 5 years EMC has developed a new business vertical focused on life sciences. John Gurnett is a member of the global life sciences group at EMC where he works closely with EMC customers to understand how to make EMC products applicable across healthcare and life sciences. In his daily work John works with policy makers from hospitals, clinicians, CEOs from partnering companies, financial controllers and researchers across the board addressing challenges in research and healthcare. One of the main questions that John works on currently is how the DNA sequencing technology can be utilised in a clinical environment. The full impact of the technology will not be realised until its usage is made mainstream. And one of the hurdles to progress is the lack of ‘digitisation’ of the healthcare system. JG: “From this point you cannot do healthcare without IT – We need to digitise everything” The perennial challenge/opportunity is trapping, storing and using data. The countries outside of the US still need to shift to Electronic Health Records (EHR) […]

DNAdigest Surveys the BioData World Congress Attendees

On 21-22 October DNAdigest attended the BioData World Congress 2015 held at the Wellcome Trust Genome Campus in Hinxton. Founder and CEO Fiona Nielsen not only attended the majority of the talks over the two day event, but also took part in the Open Innovation panel and played the role of Amy Friedman in the ‘Genomics in Play’ drama. Additionally, DNAdigest had a stand in a prime location at the event, which was manned by volunteers Craig Smith and Charlotte Whicher. If you’re interested in what people were talking about on Twitter at the conference – you can read the DNAdigest BioData World Congress Storify complete with pictures of presentations from guest speakers and various quotes from talks. Thanks to some strategically placed sweets, we were able to talk to lots of the attendees as well as some of the guest speakers including Dr Robert Green (Harvard Medical School), Dr Bob Rogers (Intel Corporation) and Dr Niklas Blomberg (ELIXIR). During the event, we conducted a short online survey on the genomic data searching / accessing / sharing habits of the attendees and speakers. In exchange for completing the survey we gave away DNAdigest Mugs and T-shirts to 6 lucky winners. Our survey consisted of 3 simple multiple choice questions: The results speak for […]

The key elements of good data sharing practice

This is a guest post by Wellcome Trust. Originally published on The Wellcome Trust is a leading partner in the Public Health Research Data Forum, which brings together research funders who are committed to increasing the sharing of health research data in ways that are equitable, ethical and efficient and will accelerate improvements in public health. On behalf of the Forum, the Trust funded a major international study of stakeholders’ views about best practices for sharing public health research data from low and middle income settings, which recently published its results. Dr Susan Bull and Prof Michael Parker, from The Ethox Centre, University of Oxford, discuss the key issues and findings of the study. Data-sharing is increasingly seen as an important component of effective and efficient biomedical research – both by researchers, and research funders. At the same time, it is recognised that efforts to increase access to individual-level data raise important ethical and governance challenges, some of which may vary depending on the context in which the research takes place. The primary argument in favour of more routine sharing of de-identified research data is its potential to generate more – and higher quality – science. This could in turn lead to improved health outcomes, and promoting […]

Information management: to federate or not to federate

This is a guest post by Yasmin Alam-Faruque, member of Eagle Genomics’ Biocuration team. Originally published on Information management is a key organisational activity that concerns the acquisition, organisation, cataloguing and structuring of information from multiple sources and its distribution to those who need it. From a scientist’s perspective, experimental results are the most important pieces of information that are analysed and interpreted to make new biological discoveries. Unless you are the one generating the results, it is not always an easy task to find and gather all other relevant datasets and documents that you need for further comparison and analyses. What is the current approach? Currently, sharing of data between researchers is a manual and complex process, which causes inefficiency since a significant fraction of researcher time is spent on this activity. New high-throughput technologies generating huge datasets are compounding the problem. We argue that new information management approaches based on data federation can help address this problem, thus leading to quicker analyses and discovery of new biological insights. Data federation is a form of data consolidation, whereby data is collected from distinct databases without ever copying or transferring the original data itself. It combines result sets from across multiple source systems and […]
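To make the idea of data federation in the excerpt above concrete, here is a minimal sketch in Python. It is an illustration only, not the approach used by Eagle Genomics: the two in-memory SQLite databases, the `datasets` table, the lab labels and the function names are all hypothetical stand-ins for independent source systems. Each source is queried in place and only the result sets are combined.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory database standing in for one independent source (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE datasets (name TEXT, organism TEXT)")
    conn.executemany("INSERT INTO datasets VALUES (?, ?)", rows)
    conn.commit()
    return conn


def federated_query(sources, organism):
    """Run the same query against every source and merge the result sets.

    The underlying tables stay where they are; only matching rows are returned.
    """
    results = []
    for label, conn in sources.items():
        cursor = conn.execute(
            "SELECT name FROM datasets WHERE organism = ? ORDER BY name",
            (organism,),
        )
        results.extend((label, name) for (name,) in cursor)
    return results


# Two separate "systems", each holding its own data (illustrative values).
sources = {
    "lab_a": make_source([("rnaseq_01", "human"), ("chipseq_07", "mouse")]),
    "lab_b": make_source([("wgs_12", "human")]),
}

hits = federated_query(sources, "human")
print(hits)
```

A real federation layer would also have to deal with differing schemas, access control and network failures, none of which this sketch attempts.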

How research data sharing can save lives

This is a guest post by Trish Groves, head of research at The BMJ. Originally published on website. Everyone’s been missing a trick. The whole debate on sharing clinical study data has focused on transparency, reproducibility, and completing the evidence base for treatments. Yet public health emergencies such as the Ebola and MERS outbreaks provide a vitally important reason for sharing study data, usually before publication or even before submission to a journal, and ideally in a public repository. Not just from randomised controlled trials, but from case series and samples, lab testing studies, surveillance studies, viral sequencing, genomic work, and other epidemiological observational studies too. During the Ebola crisis, researchers couldn’t or wouldn’t share data. Last week WHO held a consultation meeting in Geneva to tackle this. One big reason for withholding data was the mostly unfounded fear of having subsequent papers rejected by journals. But researchers capturing vital information in the field and in coordinating centres were too busy to write and submit those papers, and thus much time was lost before vital information could be disseminated. Did people die because of the Ingelfinger rule against prior publication? There were also, of course, some commercial disincentives to early data sharing, with […]

DNAdigest interviews Intel

Big Data Solutions is the leading big data initiative of Intel that aims to empower business with the tools, technologies, software and hardware for managing big data. Big Data Solutions is at the forefront of big data analytics and today we talk to Bob Rogers, Chief Data Scientist, about his role, big data for genomics and his contributions to the BioData World Congress 2015. 1. What is your background and your current role? Chief Data Scientist for Big Data Solutions. My mission is to put powerful analytics tools in the hands of every business decision maker. My responsibility is to ensure that Intel is leading in big data analytics in the areas of empowerment, efficiency, education and technology roadmap. I help customers ask the right questions to ensure that they are successful with their big data analytics initiatives. I began with a PhD in physics. During my postdoc, I got interested in artificial neural networks, which are systems that compute the way the brain computes. I co-wrote a book on time series forecasting using artificial neural networks that resulted in a number of people asking me if I could forecast the stock market. I ended up forming a quantitative futures fund with three other […]

Blockchain and Digital Health – First Impressions

Guest Post by Rodrigo Barnes, Chief Technology Officer at Aridhia. This blog post was originally published on the Aridhia website on 25 August 2015. The blog post was inspired by the Ethereum Workshop at the Turing Festival in Edinburgh. Among the many great Edinburgh festivals, the Turing Festival is the most important to the tech start-up scene locally and beyond. This weekend, I attended the Ethereum Workshop to learn about a type of “blockchain” technology and to think about how it might facilitate innovation in digital health. There’s even interest in this for genomic data sharing, as the Global Alliance and Kaiser Permanente’s John Mattison has suggested. Most people in tech have heard of Bitcoin, the cryptocurrency that is exciting libertarians and central bankers alike. One thing I learned this weekend is that, at their heart, Bitcoin and related technologies can be seen as essentially ‘open ledgers’ where transactions are recorded in a very public way, and can’t be repudiated. The gist of this is that the open ledger can be trusted even though, because of the way it is implemented, there is no central authority vouching for it. The system of maintaining the ledger is the decentralised processing of the blockchain. The question I asked myself is “how could this be applied to digital […]
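The ‘open ledger’ idea in the excerpt above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the block format of Bitcoin or Ethereum: the field names, the helper functions and the example entries are hypothetical, and the decentralised part (many peers agreeing on one chain) is left out entirely. It only shows why recorded entries are hard to alter quietly: each block stores the hash of the previous one, so changing an old entry breaks every later link.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the hash of the block before it."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a new entry that is linked to the current end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "prev_hash": prev_hash,
            "hash": block_hash(index, data, prev_hash),
        }
    )


def is_valid(chain):
    """Recompute every hash and check every link; any edit to history fails."""
    prev_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, block["data"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, "alice grants bob access to dataset-1")  # illustrative entry
append_block(ledger, "bob reads dataset-1")
print(is_valid(ledger))  # chain is intact at this point

ledger[0]["data"] = "alice grants eve access to dataset-1"  # tamper with history
print(is_valid(ledger))  # the stored hash no longer matches the altered entry
```

In a real blockchain the trust comes from many independent parties holding and extending the same chain, which is the piece this single-process sketch does not model.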

DNAdigest interviews Biopeer

Biopeer is a data sharing tool for small- to medium-scale collaborative sequencing efforts that began its journey with a group of senior students from Bilkent University, Turkey. Today, DNAdigest interviews Can Alkan, an Assistant Professor in the Department of Computer Engineering at Bilkent University and one of the minds behind Biopeer. 1. Please introduce yourself; what is your background, position? I am an Assistant Professor in the Department of Computer Engineering at Bilkent University, Ankara, Turkey. I’m a computer scientist by training; I finished my PhD at Case Western Reserve University, where I worked on algorithms for the analysis of centromere evolution, and then RNA folding and RNA-RNA interactions. Later, I did a lengthy postdoc at the Genome Sciences Department of the University of Washington. I was lucky during my postdoc that next generation sequencing started a few months after I joined UW, and suddenly I found myself in many large scale sequencing projects such as the 1000 Genomes Project. Since NGS was entirely new, we needed to develop many novel algorithms to analyze the data. Together with my colleagues I developed read mappers (mrFAST/mrsFAST) specifically for segmental duplication analysis, which we used to generate the first personalized segmental duplication and copy number polymorphism […]