Data Sharing Game

The reuse of research data benefits the scientific community as a whole, but the decision whether to archive and share these data depends primarily on individual researchers. For individuals, it is less obvious that the advantages of sharing data outweigh the associated costs, i.e. time and money. In this sense, the problem of data sharing resembles a typical game in interactive decision theory, more commonly known as game theory.

By definition, game theory is the study of mathematical models of conflict and cooperation between intelligent, rational decision-makers. A key assumption here is that an individual will always try to maximize his or her gains relative to the gains of others.

In the paper “A Research Data Sharing Game”, Pronk et al. create a framework to investigate the gains of the community versus the advantages of the individual researcher in the competitive world of scientific research. For the analysis, they designed a simple model of a scientific community in which researchers publish a certain number of papers in a given year and can choose whether or not to share their data. With this model, they examined the effect of sharing policies, explored several cost scenarios, and evaluated the overall benefits to the scientific community relative to the benefits to the individual researcher. They found that the scientific community can benefit from top-down policies that encourage data sharing, even when the act of sharing itself implies a cost. Namely, if (almost) everyone shares, many individuals can work more efficiently because datasets can be reused. Additionally, measures that ensure better data retrieval and quality can compensate for sharing costs by enabling reuse. Nevertheless, an individual researcher who decides not to share avoids the costs of sharing.

While increasing the benefits of sharing has the most positive influence on the efficiency of both the individual researcher and the scientific community, the study showed that even with moderate costs, sharing research data can still lead to a higher overall efficiency of the scientific community as a result of data reuse. A very interesting result is that, although not sharing is beneficial for the individual researcher compared to sharing, it can lead to a lower efficiency for all researchers in the community if more than a certain proportion of them adopt this strategy. Policies, however, should be able to increase the rate of sharing, and should be designed in such a way that better discoverability and quality of data at least partly compensate for the costs.
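To make the dilemma concrete, here is a minimal toy sketch in Python. It is not the model from Pronk et al.; the linear time formula, the function names, and the parameter values (`base`, `cost`, `benefit`) are all assumptions chosen purely for illustration. The idea: every paper takes some base time, sharers pay an extra cost per paper, and everyone saves time in proportion to the fraction of researchers who share.

```python
# Toy illustration of the sharing dilemma. NOT the model of Pronk et al.;
# the formula and all parameter values are illustrative assumptions.

def time_per_paper(shares, sharing_fraction, base=1.0, cost=0.1, benefit=0.3):
    """Time one researcher needs per paper (arbitrary units).

    shares           -- True if this researcher shares their data
    sharing_fraction -- fraction of the community that shares (0.0 to 1.0)
    base             -- assumed time per paper without sharing or reuse
    cost             -- assumed extra time a sharer spends preparing data
    benefit          -- assumed time saved by reuse when everyone shares
    """
    reuse_saving = benefit * sharing_fraction  # more sharers, more reusable data
    sharing_cost = cost if shares else 0.0     # only sharers pay the cost
    return base + sharing_cost - reuse_saving


def efficiency(shares, sharing_fraction, **params):
    """Papers per unit of time for one researcher."""
    return 1.0 / time_per_paper(shares, sharing_fraction, **params)


def community_efficiency(sharing_fraction, **params):
    """Average efficiency over sharers and non-sharers."""
    return (sharing_fraction * efficiency(True, sharing_fraction, **params)
            + (1.0 - sharing_fraction) * efficiency(False, sharing_fraction, **params))


if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"sharing fraction {fraction:.2f}: "
              f"sharer {efficiency(True, fraction):.3f}, "
              f"non-sharer {efficiency(False, fraction):.3f}, "
              f"community {community_efficiency(fraction):.3f}")
```

In this sketch a non-sharer is always more efficient than a sharer at the same sharing fraction, which is exactly the individual temptation. Yet as long as the assumed benefit exceeds the assumed cost, a community where everyone shares beats a community where nobody does, and once the share of non-sharers grows past roughly `cost / benefit`, even the non-sharers are worse off than they would be as sharers in a fully sharing community.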

A better solution would be lowering the costs of sharing, or even turning them into a benefit!