Searching for consistent associations with a multi-environment knockoff filter

Abstract:

This paper develops a method based on model-X knockoffs to find conditional associations that are consistent across diverse environments while controlling the false discovery rate. The motivation for this problem is that large data sets may contain numerous associations that are statistically significant and yet misleading, as they are induced by confounders or sampling imperfections. However, associations consistently replicated under different conditions may be more interesting. In fact, consistency sometimes provably leads to valid causal inferences even when conditional associations alone do not. While the proposed method is flexible and can be deployed in a wide range of applications, this paper highlights its relevance to genome-wide association studies, in which consistency across populations with diverse ancestries mitigates confounding due to unmeasured variants. The effectiveness of this approach is demonstrated by simulations and applications to the UK Biobank data.
