On the Use of Causal Models to Build Better Datasets