New Preprint: Introduction to causality in science studies
Sound causal inference is crucial for advancing the study of science. Incorrectly interpreting predictive effects as causal can lead to policy recommendations that are ineffective or even detrimental. Many publications in science studies lack appropriate methods to substantiate their causal claims. In this preprint we provide an introduction to structural causal models. Such models allow researchers to make their causal assumptions transparent and provide a foundation for causal inference. We illustrate how to use structural causal models with data simulated from a hypothetical structural causal model of Open Science. We hope our introduction helps researchers in science studies to consider causality explicitly.
The PathOS context
Questions of causality are centre stage for the PathOS project. Without a proper understanding of causality, it is impossible to provide sound policy recommendations. For example, imagine we observe that published research using open data is less reproducible. Even if open data does in fact have a positive effect on reproducibility, this negative association might appear if journals select research based on both open data and rigour. That is, journals may be more likely to publish research if it has open data, but also if it is more rigorous. Published research without open data therefore tends to be more rigorous, because otherwise it would not have been published at all. Research that is more rigorous tends to be more reproducible, and this effect might be stronger than the effect of open data. For this reason, the association between open data and reproducibility among published research might be negative, even if the actual causal effect is positive. If we incorrectly interpreted the negative association as causal, and then recommended against incentivising open data, we would be giving poor advice.
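The selection mechanism in this example can be made concrete with a small simulation. The following is a minimal sketch, not the model from the preprint: the effect sizes (+0.3 for open data, +1.0 for rigour), the publication rule, and all variable and function names are hypothetical choices made only for illustration.

```python
import numpy as np


def simulate_selection_bias(n=200_000, seed=0):
    """Simulate the open data example; all coefficients are illustrative assumptions."""
    rng = np.random.default_rng(seed)

    # Open data and rigour are independent in the full population of studies.
    open_data = rng.binomial(1, 0.5, size=n)
    rigour = rng.normal(size=n)

    # Assumed: open data has a positive effect (+0.3) on reproducibility,
    # and rigour has a stronger positive effect (+1.0).
    reproducibility = 0.3 * open_data + 1.0 * rigour + rng.normal(size=n)

    # Assumed: journals favour both open data and rigour when selecting research.
    score = 2.0 * open_data + 1.0 * rigour + rng.normal(scale=0.5, size=n)
    published = score > 1.5

    def mean_difference(mask):
        """Mean reproducibility with open data minus without, within `mask`."""
        with_open = reproducibility[mask & (open_data == 1)].mean()
        without_open = reproducibility[mask & (open_data == 0)].mean()
        return with_open - without_open

    everything = np.ones(n, dtype=bool)
    return {
        "all_research": mean_difference(everything),
        "published_only": mean_difference(published),
    }


if __name__ == "__main__":
    for name, diff in simulate_selection_bias().items():
        print(f"{name}: {diff:+.2f}")
```

Under these assumptions, the difference across all research should land close to the assumed positive effect, while the difference among published research alone should turn negative, because published research without open data is disproportionately rigorous.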
What's next?
Having a common understanding of causality and structural causal modelling helps the PathOS project interpret the existing literature and identify impact pathways. This requires us to differentiate the impact of open science from the effect of openness on that impact. That is: how does the fact that something is open (be it a publication, data, code, or review) causally affect its impact? The introduction to causality provides such a common understanding. This will be especially important as PathOS builds upon the knowledge gained through our evidence scoping and intervention logic definition to further map and validate Open Science impact pathways and their verification methods (work on which is well underway; watch this space!).