[BG25]: Causation, Meaning, and Communication #
Formalizes Beller & Gerstenberg (2025) "Causation, Meaning, and Communication," Psychological Review 133(2), 339–381.
A counterfactual simulation model of causal language combining:
- Causal knowledge module: computes whether-causation (W), how-causation (H), and sufficient-causation (S) from counterfactual simulations
- Semantics module: defines four causal expressions as logical combinations of W, H, S (Eqs. 4–7)
- Pragmatics module: RSA inference selects the most informative true expression
Semantics (Eqs. 4–7) #
- affected(A → e) = W ∨ H ∨ S (any causal involvement)
- enabled(A → e) = W ∨ S (necessity or sufficiency)
- caused(A → e) = H ∧ (W ∨ S) ∧_σ M ∧_σ U (how-cause + counterfactual, softened by movement M and uniqueness U; σ controls softening)
- made no difference(A → e) = ¬W ∧ ¬_ν H ∧ ¬S (no causal role, with soft negation ¬_ν of how-cause; ν controls softening)
The expressions form a hierarchy of specificity: caused ⊂ enabled ⊂ affected. This hierarchy drives scalar implicatures via RSA pragmatic reasoning.
Simplification #
This formalization uses Boolean W, H, S (matching Table 1's illustrative scenarios where aspect values are 0 or 1). The full paper computes W and S as graded probabilities from noisy counterfactual simulations (Eqs. 1, 3).
The core Boolean semantics omit M (movement) and U (uniqueness), which only affect "caused" via soft conjunction. These features are process constraints that reduce the probability of "caused" when the candidate cause was stationary (M=false) or not the unique contactor (U=false). For the illustrative Table 1 scenarios, M=true and U=true throughout.
Experiments #
- Experiment 1A: Norming study (N=50). 15/20 sentence frames had median acceptability ≥ 4 (scale midpoint) for all three causal expressions.
- Experiment 1B: Semantic relations (N=53). Contradictory orderings (e.g., "caused but did not enable") rated less acceptable than consistent orderings (e.g., "affected but did not cause"). Confirms overlapping (not inconsistent) semantics for "caused" and "enabled."
- Experiment 1C: Implicature cancellation (N=51). "in fact" cancellations (e.g., "affected, in fact caused") rated more acceptable than redundant specifications (e.g., "caused, in fact affected").
- Experiment 2: Speaker task (N=62, 30 billiard scenarios). Full model r=0.87, RMSE=0.13. Fitted: θ=1.0, σ=0.65, ν=0.25, λ=40.18.
- Experiment 3: Listener task (N=50, 36 trials). Full model r=0.91, RMSE=0.10.
The four causal expressions studied in the paper (Figure 2, p. 343).
Participants chose among these to describe billiard-ball interactions:
- "Ball A caused Ball B to go through the gate."
- "Ball A enabled Ball B to go through the gate."
- "Ball A affected Ball B's going through the gate."
- "Ball A made no difference to Ball B's going through the gate."
- caused : CausalExpression
- enabled : CausalExpression
- affected : CausalExpression
- madeNoDifference : CausalExpression
Instances For
Equations
- BellerGerstenberg2025.instDecidableEqCausalExpression x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Core causal aspects from the counterfactual simulation model (CSM).
- W (whether-causation): P(e' ≠ e | s, remove(A)). Was the cause counterfactually necessary? (Eq. 1)
- H (how-causation): P(Δe' ≠ Δe | s, change(A)). Did the cause affect the fine-grained outcome? Binary. (Eq. 2)
- S (sufficient-causation): P(W(A → e) | s, remove(¬A)). Would the cause have been a whether-cause if alternatives were removed? (Eq. 3)
- whether : Bool
- how : Bool
- sufficient : Bool
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Boolean semantics of causal expressions (Eqs. 4–7, pp. 346–347).
This is the core of the semantics, omitting the soft constraints (σ for M/U in "caused", ν for H in "made no difference").
- affected: W ∨ H ∨ S — any causal involvement (Eq. 4)
- enabled: W ∨ S — necessity or sufficiency, not just how (Eq. 5)
- caused: H ∧ (W ∨ S) — how-cause plus counterfactual (core of Eq. 6)
- madeNoDifference: ¬W ∧ ¬H ∧ ¬S — no causal involvement (hard version of Eq. 7)
Equations
- BellerGerstenberg2025.expressionMeaning cw BellerGerstenberg2025.CausalExpression.affected = (cw.whether || cw.how || cw.sufficient)
- BellerGerstenberg2025.expressionMeaning cw BellerGerstenberg2025.CausalExpression.enabled = (cw.whether || cw.sufficient)
- BellerGerstenberg2025.expressionMeaning cw BellerGerstenberg2025.CausalExpression.caused = (cw.how && (cw.whether || cw.sufficient))
- BellerGerstenberg2025.expressionMeaning cw BellerGerstenberg2025.CausalExpression.madeNoDifference = (!cw.whether && !cw.how && !cw.sufficient)
Instances For
"caused" implies "enabled": H ∧ (W ∨ S) → W ∨ S.
"Caused" is the most specific expression (p. 349).
"enabled" implies "affected": W ∨ S → W ∨ H ∨ S.
Full scalar chain: caused → enabled → affected.
"madeNoDifference" is the negation of "affected" (Boolean version).
Graded semantics with soft negation parameter ν (Eq. 7, p. 347).
The full paper uses ν to soften the negation of H in "made no difference": when H=true but W=false and S=false (Ball A is a how-cause only), "made no difference" gets semantic value ν instead of 0.
For "caused," the full model also softens M (movement) and U (uniqueness) via parameter σ (Eq. 6). Since the illustrative Table 1 scenarios all have M=true and U=true, we omit these here — the core H ∧ (W ∨ S) suffices.
Equations
- One or more equations did not get rendered due to their size.
- BellerGerstenberg2025.gradedMeaning cw ν BellerGerstenberg2025.CausalExpression.affected = if (cw.whether || cw.how || cw.sufficient) = true then 1 else 0
- BellerGerstenberg2025.gradedMeaning cw ν BellerGerstenberg2025.CausalExpression.enabled = if (cw.whether || cw.sufficient) = true then 1 else 0
- BellerGerstenberg2025.gradedMeaning cw ν BellerGerstenberg2025.CausalExpression.caused = if (cw.how && (cw.whether || cw.sufficient)) = true then 1 else 0
Instances For
Scenario 1: Classic Michottean launch. Ball A collides with stationary Ball B, launching it through the gate. W=1, H=1, S=1.
Equations
- BellerGerstenberg2025.scenario1 = { whether := true, how := true, sufficient := true }
Instances For
Scenario 2: Ball A knocks a box out of Ball B's path. Ball B was already heading toward the gate. W=1, H=0, S=1 (double prevention).
Equations
- BellerGerstenberg2025.scenario2 = { whether := true, how := false, sufficient := true }
Instances For
Scenario 3: Ball B is heading toward the gate on its own. Ball A comes up from behind and pushes it along. W=0, H=1, S=0 (how-cause only).
Equations
- BellerGerstenberg2025.scenario3 = { whether := false, how := true, sufficient := false }
Instances For
Scenario 4: Ball A does not interact with Ball B at all. No causal involvement. W=0, H=0, S=0.
Equations
- BellerGerstenberg2025.scenario4 = { whether := false, how := false, sufficient := false }
Instances For
Illustrative ν parameter for Table 1 (p. 348: "set to 0.2").
Equations
- BellerGerstenberg2025.ν_table1 = 1 / 5
Instances For
In an H-only world (Scenario 3), "made no difference" has non-zero semantic value ν=1/5 due to soft negation (Eq. 7). This is the key distinction from the hard Boolean semantics.
[BG25]'s pragmatics module as a mathlib-PMF Frank-Goodman model [FG12]: the literal listener is uniform on each expression's extension over the 8 causal worlds (Table 1c), the pragmatic speaker normalises literal informativity over expressions (α = 1, no cost; Table 1d), and the pragmatic listener is the Bayesian posterior against the uniform world prior. The fitted model (Experiment 2) uses λ = 40.18, but α = 1 suffices for the qualitative predictions.
Boolean meaning in speaker argument order (expression first).
Equations
Instances For
Literal listener L0(·|u): uniform on the expression's extension.
Equations
Instances For
Pragmatic speaker S1(·|cw) ∝ L0(cw|·) (α = 1, zero cost; Table 1d).
Equations
- BellerGerstenberg2025.S1 cw = RSA.S1Belief BellerGerstenberg2025.L0 (fun (x : BellerGerstenberg2025.CausalExpression) => 1) 1 cw ⋯ ⋯
Instances For
Uniform prior over the 8 causal worlds.
Equations
- BellerGerstenberg2025.worldPrior = PMF.uniformOfFintype BellerGerstenberg2025.CausalWorld
Instances For
Pragmatic listener L1(·|u): Bayesian posterior of S1 against the
uniform world prior.
Equations
Instances For
Same-world expression preference reduces to comparing literal-listener mass — the speaker's partition function cancels.
≤ companion of S1_lt_iff.
S1 speaker predictions from the full 8-world Boolean model #
These theorems verify that the PMF-face model reproduces the qualitative predictions from Table 1d. The predictions arise from the full space of 2³ = 8 causal worlds with uniform prior and Boolean semantics: extension sizes are caused 3, enabled 6, affected 7, madeNoDifference 1.
In each scenario, the pragmatic speaker selects the most informative true expression, producing the same preference orderings as Table 1d.
Scenario 1 (full causation): S1 prefers "caused" over "enabled."
In (W=1, H=1, S=1), all positive expressions are literally true. "caused" applies to only 3/8 worlds while "enabled" applies to 6/8, so L0("caused") is more informative → S1 selects "caused."
Scenario 1: S1 prefers "enabled" over "affected."
Scenario 2 (double prevention): S1 prefers "enabled" over "affected."
In (W=1, H=0, S=1), "caused" is literally false (H=0), so the speaker chooses between "enabled" and "affected." "enabled" is more informative (6/8 vs 7/8 worlds), so S1 selects it.
Scenario 2: "caused" is ruled out (literally false at H=0).
Scenario 3 (H-only): S1 prefers "affected" over "caused."
In (W=0, H=1, S=0), "affected" is the only true positive expression. "caused" requires H ∧ (W ∨ S), which fails when W=S=0.
Scenario 3: "affected" also beats "madeNoDifference."
In the Boolean model (unlike the graded model with ν), "madeNoDifference" is strictly false in an H-only world: ¬W ∧ ¬H ∧ ¬S fails because H=1.
Scenario 4 (no causation): S1 prefers "madeNoDifference" over "affected."
In (W=0, H=0, S=0), only "madeNoDifference" is literally true.
L1 listener predictions: scalar implicature effects #
The pragmatic listener (L1) inverts S1 via Bayes' rule. Hearing a weaker expression triggers a scalar implicature: the listener infers the speaker chose not to use a stronger expression, shifting probability away from worlds where the stronger expression would have been true.
Each comparison cancels the shared marginal and uniform prior
(PMF.posterior_lt_iff_kernel_lt_of_uniform), reducing to a cross-world
S1 comparison: a vacuous zero where the expression is false, or a
partition-dominance where the literal weights agree.
L1 hearing "caused": higher probability for full-causation world than no-causation world.
L1("caused") assigns positive mass only to worlds where "caused" is literally true (H ∧ (W ∨ S)). The no-causation world (F,F,F) gets zero.
Speaker partition functions at the two "enabled"-worlds: at (T,F,F) only
"enabled" and "affected" are true (Z = 6⁻¹ + 7⁻¹); at (T,T,T) "caused" also
competes (Z = 3⁻¹ + 6⁻¹ + 7⁻¹). The smaller partition means a sharper
speaker, so hearing "enabled" favours (T,F,F).
L1 scalar implicature for "enabled": hearing "enabled" makes the listener prefer worlds where "caused" is false over worlds where it's true.
Both (T,F,F) and (T,T,T) make "enabled" literally true (W ∨ S). But at (T,T,T), S1 would have said "caused" instead (more informative), so L1 down-weights (T,T,T) upon hearing "enabled." This is the classic scalar implicature: "enabled" ⇝ ¬caused.
L1 hearing "madeNoDifference": identifies the no-causation world.
"madeNoDifference" is true only at (F,F,F), so L1 assigns it probability 1.
Canonical test worlds
Equations
- BellerGerstenberg2025.world_W_only = { whether := true, how := false, sufficient := false }
Instances For
Equations
- BellerGerstenberg2025.world_S_only = { whether := false, how := false, sufficient := true }
Instances For
Equations
- BellerGerstenberg2025.world_H_only = { whether := false, how := true, sufficient := false }
Instances For
Equations
- BellerGerstenberg2025.world_full = { whether := true, how := true, sufficient := true }
Instances For
Equations
- BellerGerstenberg2025.world_none = { whether := false, how := false, sufficient := false }
Instances For
Equations
- BellerGerstenberg2025.world_caused = { whether := true, how := true, sufficient := false }
Instances For
In W-only world, "enabled" is true but "caused" is false. Speaker would use "enabled" (Scenario 2-like).
In H-only world, only "affected" applies among the positive expressions.
In full world, all positive expressions apply.
In none world, only "madeNoDifference" applies.
Fitted model parameters from Experiment 2 (p. 358, MLE).
θ=1.0 (counterfactual noise), σ=0.65 (softening for M/U in "caused"), ν=0.25 (softening for H in "made no difference"), λ=40.18 (RSA speaker optimality).
- θ : ℚ
- σ : ℚ
- ν : ℚ
- alpha : ℚ
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- BellerGerstenberg2025.exp2_params = { θ := 1, σ := 65 / 100, ν := 1 / 4, alpha := 4018 / 100 }
Instances For
Experiment 2 cross-validation (Table 8, p. 362).
- model : String
- r_median : ℚ
- rmse_median : ℚ
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- BellerGerstenberg2025.exp2_fullModel = { model := "Full model", r_median := 87 / 100, rmse_median := 13 / 100 }
Instances For
Equations
- BellerGerstenberg2025.exp2_noPragmatics = { model := "No pragmatics", r_median := 74 / 100, rmse_median := 18 / 100 }
Instances For
Equations
- BellerGerstenberg2025.exp2_noSemNoPrag = { model := "No sem & no prag", r_median := 53 / 100, rmse_median := 25 / 100 }
Instances For
Full model outperforms alternatives in Experiment 2.
Experiment 3 cross-validation (Table 9, p. 366).
Equations
- BellerGerstenberg2025.exp3_fullModel = { model := "Full model", r_median := 91 / 100, rmse_median := 10 / 100 }
Instances For
Equations
- BellerGerstenberg2025.exp3_noPragmatics = { model := "No pragmatics", r_median := 83 / 100, rmse_median := 13 / 100 }
Instances For
Full model outperforms in Experiment 3 (listener task).
Bridge to Causation #
Beller & Gerstenberg's W, H, S dimensions can be COMPUTED from structural
causal models, grounding the primitive Boolean features in the counterfactual
reasoning machinery of Causation.
| B&G aspect | Structural definition |
|---|---|
| W (whether) | causallyNecessary — would effect still occur without cause? |
| H (how) | hasDirectLaw — does a causal law directly connect cause to effect? |
| S (sufficient) | causallySufficient — does adding cause guarantee effect? |
The mapping of H to hasDirectLaw is an approximation. B&G's how-causation
(Eq. 2) tests whether the fine-grained outcome would differ under a small
perturbation of the cause, which is a richer notion than structural directness.
For simple causal models, the two coincide.
Equations
- BellerGerstenberg2025.instDecidableEqBGVar x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- BellerGerstenberg2025.instReprBGVar = { reprPrec := BellerGerstenberg2025.instReprBGVar.repr }
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Equations
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Depth function for BGVar: roots at 0, intermediate at 1, effect at 2.
Strictly increases along every parent edge in all three BG2025 graphs
(soloGraph, overdetGraph, chainGraph), so IsDAG.of_depth
discharges acyclicity uniformly.
Equations
Instances For
Compute a CausalWorld from a V2 BoolSEM. W = causally necessary,
H = direct law in graph, S = causally sufficient.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The causal expression Horn scale: ⟨affected, enabled, caused⟩.
Ordered from weakest (true in most scenarios) to strongest (true in fewest).
"madeNoDifference" is excluded: it is the Boolean complement of "affected"
(proved in madeNoDifference_iff_not_affected), not a weaker scalar
alternative.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Using "affected" implicates ¬enabled ∧ ¬caused (Experiment 1C).
Using "enabled" implicates ¬caused.
Scale ordering is grounded in semantic entailment: stronger expressions entail weaker ones at every world. This is the defining property of a Horn scale — the scale structure isn't stipulated, it's derived from the entailment relations proved in Section 4.
Structural Model → Causation Type → Expression → S1 #
Each theorem below chains across three levels of analysis:
- Structural: A
BoolSEM(structural equation model) yields directness and necessity viahasDirectLaw/causallyNecessary - Causation type: These determine production vs dependence
causation via
causationType(fromProductionDependence.lean) - Semantic + Pragmatic: The derived
CausalWorlddetermines which expressions are literally true, and the S1 pragmatic speaker selects the most informative one
These are the deepest integration points: a change to normalDevelopment,
expressionMeaning, or the L0/S1 computation would break these theorems.
End-to-end pipeline theorems linking V2 BoolSEM models to RSA S1
predictions. causationType (from ProductionDependence.lean)
takes plain Bools, so its inputs are computed with the SEM
predicates and passed through.
The solo model computes the full-profile world ⟨W,H,S⟩ = ⟨1,1,1⟩:
Def 10b necessity and bare sufficiency proven through the fuel
bridge, then converted into the decide-wrapped Bools.
Solo cause → S1 prefers "caused". From a V2 model with one direct law (cause → effect), the pipeline produces the correct pragmatic prediction.
Overall Acceptance Rates (Figure 2) #
Proportion of "Yes" (Accurate) responses by verb. Ordering: caused > made > forced.
Rates stored as percentages (Nat) to avoid heavy Mathlib imports.
An experimental observation: verb form paired with acceptance rate (%).
- verb : String
- ratePct : ℕ
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- BellerGerstenberg2025.causedRate = { verb := "caused", ratePct := 48 }
Instances For
Equations
- BellerGerstenberg2025.madeRate = { verb := "made", ratePct := 40 }
Instances For
Equations
- BellerGerstenberg2025.forcedRate = { verb := "forced", ratePct := 35 }
Instances For
Acceptance ordering: caused > made > forced (Figure 2).
Main Effect Coefficients (Table 8) #
Bayesian logistic regression with verb × SUFresidALT × INT × ALT.
Coefficients stored as (numerator, denominator=100) to avoid ℚ imports.
A regression coefficient with its 95% credible interval. Values are × 100 (e.g., 119 means 1.19).
- name : String
- estimate100 : ℤ
- lowerCI100 : ℤ
- upperCI100 : ℤ
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
A coefficient is reliable when its 95% CI excludes 0.
Equations
- c.isReliable = (decide (c.lowerCI100 > 0) && decide (c.upperCI100 > 0) || decide (c.lowerCI100 < 0) && decide (c.upperCI100 < 0))
Instances For
Equations
- BellerGerstenberg2025.coeff_sufResidAlt = { name := "SUFresidALT", estimate100 := 119, lowerCI100 := 89, upperCI100 := 150 }
Instances For
Equations
- BellerGerstenberg2025.coeff_int = { name := "INT", estimate100 := 54, lowerCI100 := 28, upperCI100 := 81 }
Instances For
Equations
- BellerGerstenberg2025.coeff_alt = { name := "ALT", estimate100 := -82, lowerCI100 := -111, upperCI100 := -55 }
Instances For
SUF has the largest absolute main effect.
ALT has a negative main effect (more alternatives → less acceptable).
Per-Verb Interaction Reliability (Table 9) #
Estimates of interaction intercepts by verb.
Per-verb reliable interaction from Table 9.
- verb : String
- interaction : String
- positive : Bool
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
made uniquely has a reliable SUFresidALT×INT interaction (Table 9).
Equations
- BellerGerstenberg2025.made_sufInt = { verb := "made", interaction := "SUFresidALT:INT", positive := true }
Instances For
All verbs share reliable SUFresidALT×ALT interaction.
Equations
- BellerGerstenberg2025.caused_sufAlt = { verb := "caused", interaction := "SUFresidALT:ALT", positive := true }
Instances For
Equations
- BellerGerstenberg2025.made_sufAlt = { verb := "made", interaction := "SUFresidALT:ALT", positive := true }
Instances For
Equations
- BellerGerstenberg2025.forced_sufAlt = { verb := "forced", interaction := "SUFresidALT:ALT", positive := true }
Instances For
All verbs share reliable INT×ALT interaction.
Equations
- BellerGerstenberg2025.caused_intAlt = { verb := "caused", interaction := "INT:ALT", positive := true }
Instances For
Equations
- BellerGerstenberg2025.made_intAlt = { verb := "made", interaction := "INT:ALT", positive := true }
Instances For
Equations
- BellerGerstenberg2025.forced_intAlt = { verb := "forced", interaction := "INT:ALT", positive := true }
Instances For
Acceptability Contrasts #
Non-interchangeability of causative verbs across contexts.
Acceptability judgment for a causative verb in context.
- acceptable : Acceptability
- marginal : Acceptability
- unacceptable : Acceptability
Instances For
Equations
- BellerGerstenberg2025.instDecidableEqAcceptability x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
A single acceptability judgment: verb + context + status.
- verb : String
- context : String
- judgment : Acceptability
- source : String
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
Low sufficiency, low intention context: only cause is acceptable.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
High sufficiency, high intention, low alternatives: all verbs acceptable.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
In high-SUF, high-INT, low-ALT contexts, all three verbs are acceptable.
In low-SUF contexts, only cause is acceptable.