Research Validation

Reasoning Test

You will be shown 12 causal reasoning questions. Each includes the causal graph structure and probability tables needed to compute the answer. Read the tables carefully: the same data can produce different answers depending on whether you're asked an observational, interventional, or counterfactual question.
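As an illustration of why the question type matters, here is a minimal sketch in Python. The graph (Z -> X, Z -> Y, X -> Y), the variable names, and every probability in the tables are hypothetical and are not taken from the test questions. The sketch covers only the observational and interventional cases; a counterfactual query would need a full structural model and is left out.

```python
# Hypothetical tables for the binary graph Z -> X, Z -> Y, X -> Y.
# All numbers are made up for illustration only.
P_Z1 = 0.5                                  # P(Z=1)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,  # P(Y=1 | X=x, Z=z), keyed by (x, z)
                 (0, 1): 0.6, (1, 1): 0.8}


def p_z(z):
    """P(Z=z)."""
    return P_Z1 if z == 1 else 1 - P_Z1


def p_x_given_z(x, z):
    """P(X=x | Z=z)."""
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1 - p1


def observational(x):
    """P(Y=1 | X=x): weight each z by P(z | X=x), obtained via Bayes' rule."""
    joint = {z: p_z(z) * p_x_given_z(x, z) for z in (0, 1)}  # P(Z=z, X=x)
    p_x = sum(joint.values())                                # P(X=x)
    return sum(P_Y1_GIVEN_XZ[(x, z)] * joint[z] / p_x for z in (0, 1))


def interventional(x):
    """P(Y=1 | do(X=x)): backdoor adjustment, weight each z by P(z)."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * p_z(z) for z in (0, 1))


print(f"P(Y=1 | X=1)     = {observational(1):.2f}")   # 0.70
print(f"P(Y=1 | do(X=1)) = {interventional(1):.2f}")  # 0.55
```

Both answers come from the same tables. Observing X=1 makes Z=1 more likely, which raises the estimate for Y; setting X=1 by intervention cuts the Z -> X edge, so Z keeps its original distribution.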

Human experts score ~98% on similar tasks. The best LLMs score ~29%.