
Linglib.Studies.GrusdtLassiterFranke2022

[GLF22]

"Probabilistic modeling of rational communication with conditionals" PLoS ONE 17(7): e0269937.

Overview

This paper extends RSA to model conditionals by:

  1. Treating "worlds" as probability distributions (WorldState)
  2. Using assertability (P(C|A) ≥ θ) as the literal meaning of conditionals
  3. Having L1 infer both the world state AND the causal structure

The key insight is that the literal meaning of "if A then C" is an assertability condition — P(C|A) ≥ θ — rather than material implication. This grounds RSA's meaning function in conditional probability.

Toy Example (§2.3, Table 2)

The paper's illustrative example uses 3 states and 4 utterances with θ = 0.9:

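| State | P(A) | P(C) | P(A,C) | P(C\|A) | likely C | if A, C | C | A and C |
|-------|------|------|--------|---------|----------|---------|---|---------|
| s1    | 0.9  | 0.9  | 0.81   | 0.9     | ✓        | ✓       | ✓ | ✗       |
| s2    | 0.65 | 0.65 | 0.6    | 12/13   | ✓        | ✓       | ✗ | ✗       |
| s3    | 0.6  | 0.6  | 0.36   | 0.6     | ✓        | ✗       | ✗ | ✗       |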

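Key predictions:

  • In s1, S1 prefers "C" over "if A then C" over "likely C".
  • In s2, S1 prefers "if A then C" over "likely C"; "C" is not assertable.
  • In s3, "likely C" is the only assertable utterance.
  • L1 hearing "if A then C" prefers s2 over both s1 and s3.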

Full Model (§3)

The full model uses 10,000 sampled world states and three causal relations (A→C, C→A, A⊥C) as latent variables. We formalize only the toy example here, which captures all qualitative predictions with finite types amenable to exact PMF evaluation.

Grounding

The assertability truth values are grounded in the probabilistic assertability condition from Semantics.Probabilistic.ConditionalAssertability: a conditional "if A then C" is assertable when P(A) > 0 and P(C|A) ≥ θ.

Experiments

  1. Experiment 1: Dependency Inference — Participants hear "if A then C" and judge causal structure. 72% infer A→C, 15% C→A, 10% A⊥C.

  2. Experiment 2: Conditional Perfection — 85% endorse "if ¬A then ¬C" in A→C contexts vs 45% in independent contexts.

  3. Experiment 3: Assertability Thresholds — θ ≈ 0.88 from model fitting.

Utterances from the toy example (Table 2, §2.3).

The paper uses a richer utterance space in the full model, but the toy example uses four utterances ordered by informativity:

  • "likely C": assertable when P(C) ≥ 0.5 (weakest, true in all toy states)
  • "if A then C": assertable when P(C|A) ≥ 0.9
  • "C": assertable when P(C) ≥ 0.9
  • "A and C": assertable when P(A∧C) ≥ 0.9
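The four thresholds above can be checked mechanically. The following is a minimal stand-alone Lean 4 sketch, not the Linglib definitions: it stores probabilities as integer percentages to avoid a rational-number dependency, and the names `Dist`, `likelyC`, `conditional`, `bareC`, `conjAC`, and `row` are illustrative assumptions.

```lean
/-- Toy world state with probabilities stored as integer percentages
    (hypothetical encoding; Linglib's `WorldState` is not used here). -/
structure Dist where
  pA  : Nat  -- P(A)   in percent
  pC  : Nat  -- P(C)   in percent
  pAC : Nat  -- P(A∧C) in percent

def s1 : Dist := ⟨90, 90, 81⟩
def s2 : Dist := ⟨65, 65, 60⟩
def s3 : Dist := ⟨60, 60, 36⟩

/-- "likely C": P(C) ≥ 0.5. -/
def likelyC (w : Dist) : Bool := decide (w.pC ≥ 50)
/-- "if A then C": P(A) > 0 and P(C|A) ≥ 0.9,
    cross-multiplied to 10·P(A∧C) ≥ 9·P(A). -/
def conditional (w : Dist) : Bool :=
  decide (0 < w.pA ∧ 10 * w.pAC ≥ 9 * w.pA)
/-- "C": P(C) ≥ 0.9. -/
def bareC (w : Dist) : Bool := decide (w.pC ≥ 90)
/-- "A and C": P(A∧C) ≥ 0.9. -/
def conjAC (w : Dist) : Bool := decide (w.pAC ≥ 90)

/-- One row of the truth table: (likely C, if A then C, C, A and C). -/
def row (w : Dist) : Bool × Bool × Bool × Bool :=
  (likelyC w, conditional w, bareC w, conjAC w)

#eval row s1  -- (true, true, true, false)
#eval row s2  -- (true, true, false, false)
#eval row s3  -- (true, false, false, false)
```

Cross-multiplying the conditional threshold keeps the comparison exact for s2, where P(C|A) = 12/13 has no finite decimal expansion.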

"A and C" is false in all three toy states, making it a vacuous alternative that nonetheless affects the pragmatic competition.


      World states from the toy example (Table 2, §2.3).

      Each state is a probability distribution (pA, pC, pAC) representing a different degree of dependence between A and C:

      • s1: High marginals, P(C|A) = 0.9 = θ (borderline assertable)
      • s2: Moderate marginals, P(C|A) = 12/13 ≈ 0.923 > θ (clearly assertable)
      • s3: Moderate marginals, P(C|A) = 0.6 < θ (not assertable)

          WorldState for s1: P(A)=0.9, P(C)=0.9, P(A∧C)=0.81. P(C|A) = 0.81/0.9 = 0.9 = θ.


            WorldState for s2: P(A)=0.65, P(C)=0.65, P(A∧C)=0.6. P(C|A) = 0.6/0.65 = 12/13 ≈ 0.923 > θ.


              WorldState for s3: P(A)=0.6, P(C)=0.6, P(A∧C)=0.36. P(C|A) = 0.36/0.6 = 0.6 < θ.


                Assertability threshold θ = 0.9 from the paper.


                  Assertability truth table for the toy example (Table 2).

Defines when each utterance is assertable in each state. The paper uses P(C|A) ≥ θ (non-strict), while Assertability.assertable uses > θ. In the toy example s1 has P(C|A) = 0.9 = θ exactly, so the conditional is assertable at s1 under ≥ but not under >. We therefore define the truth table directly to match the paper's non-strict threshold.

                  Truth values from Table 2:

                  • likely C (P(C) ≥ 0.5): s1 ✓, s2 ✓, s3 ✓ (all states have P(C) ≥ 0.6 > 0.5)
                  • if A then C (P(C|A) ≥ 0.9): s1 ✓, s2 ✓, s3 ✗
                  • C (P(C) ≥ 0.9): s1 ✓, s2 ✗, s3 ✗
                  • A and C (P(A∧C) ≥ 0.9): s1 ✗, s2 ✗, s3 ✗
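The boundary case at s1 can be made concrete with a small stand-alone Lean 4 sketch. The helper names `geTheta` and `gtTheta` are hypothetical and do not correspond to Linglib definitions; probabilities are passed as integer percentages and θ = 9/10 is applied by cross-multiplication.

```lean
/-- Non-strict test P(C|A) ≥ 9/10, written as 10·P(A∧C) ≥ 9·P(A). -/
def geTheta (pA pAC : Nat) : Bool := decide (10 * pAC ≥ 9 * pA)
/-- Strict test P(C|A) > 9/10, written as 10·P(A∧C) > 9·P(A). -/
def gtTheta (pA pAC : Nat) : Bool := decide (10 * pAC > 9 * pA)

-- s1: P(A) = 0.9, P(A∧C) = 0.81, so P(C|A) = θ exactly.
example : geTheta 90 81 = true  := by decide
example : gtTheta 90 81 = false := by decide
-- s2: P(C|A) = 12/13 passes both tests.
example : geTheta 65 60 = true := by decide
example : gtTheta 65 60 = true := by decide
-- s3: P(C|A) = 0.6 fails both tests.
example : geTheta 60 36 = false := by decide
example : gtTheta 60 36 = false := by decide
```

Only s1 distinguishes the two tests, which is why a truth table stated directly is enough to reproduce the paper's non-strict reading.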

                    Verify that the assertability truth values are grounded in the actual conditional/marginal probabilities of the WorldStates. These connect the directly-defined truth table to the assertability theory.

                    s1 has P(C|A) = 9/10 ≥ θ.

                    s2 has P(C|A) = 12/13 > θ.

                    s3 has P(C|A) = 3/5 < θ.

                    s1 has P(C) = 9/10 ≥ 0.9 (so "C" is assertable).

                    s2 has P(C) = 13/20 ≥ 0.5 (so "likely C" is assertable).

                    s3 has P(C) = 3/5 ≥ 0.5 (so "likely C" is assertable).

                    s1 has P(A∧C) = 81/100 < 0.9 (so "A and C" is not assertable).

Under Assertability.assertable, which uses strict >, the forward conditional is assertable at s2 but not at s1. This demonstrates the ≥ vs > mismatch that motivates the direct truth table.

                    The toy example as a mathlib-PMF Frank-Goodman model [FG12]: the literal listener is uniform on each utterance's assertability extension, the pragmatic speaker S1 normalises the literal weights over utterances (α = 1, no cost), and the pragmatic listener L1 is the Bayesian posterior against the uniform world prior. conjAC, assertable in no state, has an empty extension: it carries weight 0 and drops out of the competition rather than receiving an (undefined) literal-listener PMF.

                    @[reducible, inline]

                    Assertability extension of an utterance: the states where it holds.

                      noncomputable def GrusdtLassiterFranke2022.litWeight (s : State) (u : Utt) :
                      ENNReal

                      Literal-listener weight L0(s | u) at α = 1: uniform on the extension, 1/|ext u| where u is assertable at s, else 0. For the three assertable utterances this coincides with RSA.L0OfBoolMeaning (litWeight_eq_L0OfBoolMeaning); the never-assertable conjAC is weightless.

theorem GrusdtLassiterFranke2022.litWeight_of_true {s : State} {u : Utt} {k : ℕ} (h : assertable' u s = true) (hk : (ext u).card = k) :
                        litWeight s u = (↑k)⁻¹

                        Where an utterance is assertable somewhere, litWeight is exactly the canonical literal listener uniform on the extension; conjAC gets 0.

                        Toy-example weights (Table 2): likelyC is assertable in all 3 states, conditional in {s1, s2}, C in {s1}, conjAC in none.
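These weights can be reproduced with a minimal Lean 4 sketch that avoids ENNReal and mathlib altogether. It is an illustration under stated assumptions, not the library code: the truth table is restated locally, and every weight is scaled by 6 (the least common multiple of the extension sizes 1, 2, 3) so that it is a natural number. The names `assertable`, `extCard`, and `litWeight6` are hypothetical.

```lean
inductive State where
  | s1 | s2 | s3

inductive Utt where
  | likelyC | conditional | C | conjAC

/-- Truth table from Table 2 (local restatement, not Linglib's `assertable'`). -/
def assertable : Utt → State → Bool
  | .likelyC,     _   => true
  | .conditional, .s3 => false
  | .conditional, _   => true
  | .C,           .s1 => true
  | .C,           _   => false
  | .conjAC,      _   => false

/-- Size of the extension: the number of states where `u` is assertable. -/
def extCard (u : Utt) : Nat :=
  ([State.s1, State.s2, State.s3].filter (assertable u)).length

/-- 6 · L0(s | u): 6 / |ext u| where `u` holds at `s`, else 0. -/
def litWeight6 (s : State) (u : Utt) : Nat :=
  if assertable u s then 6 / extCard u else 0

#eval extCard .conjAC              -- 0
#eval litWeight6 .s1 .likelyC      -- 2  (L0 = 1/3)
#eval litWeight6 .s1 .conditional  -- 3  (L0 = 1/2)
#eval litWeight6 .s1 .C            -- 6  (L0 = 1)
#eval litWeight6 .s1 .conjAC       -- 0
```

Because `conjAC` is never assertable, the `else` branch fires in every state and the division by an empty extension is never reached.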

                        noncomputable def GrusdtLassiterFranke2022.S1 (s : State) :
                        PMF Utt

                        Pragmatic speaker S1(· | s) ∝ L0(s | ·) (α = 1, no cost), normalising the literal weights over utterances.

                          theorem GrusdtLassiterFranke2022.S1_lt_iff (s : State) (u₁ u₂ : Utt) :
(S1 s) u₁ < (S1 s) u₂ ↔ litWeight s u₁ < litWeight s u₂

                          Same-state utterance preference reduces to comparing literal weights — the speaker's partition function cancels.

theorem GrusdtLassiterFranke2022.S1_ne_zero {s : State} {u : Utt} (h : litWeight s u ≠ 0) :
(S1 s) u ≠ 0

                          Uniform world prior over the three states.

noncomputable def GrusdtLassiterFranke2022.L1 (u : Utt) (h : PMF.marginal S1 worldPrior u ≠ 0) :
                            PMF State

                            Pragmatic listener L1(· | u): the Bayesian posterior of S1 against the uniform world prior.


                              Partition functions Z(s) = ∑_u L0(s | u) per state (s1: 11/6, s2: 5/6, s3: 1/3). A smaller partition means a sharper speaker, so L1 prefers the world with the smaller partition when the numerators agree.
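The three partition values can be checked with a short Lean 4 sketch. It is stand-alone and illustrative: the literal weights are hard-coded from the toy example and scaled by 6 so that they are natural numbers, and the names `w6` and `Z6` are hypothetical rather than Linglib identifiers.

```lean
inductive State where
  | s1 | s2 | s3

inductive Utt where
  | likelyC | conditional | C | conjAC

/-- 6 · L0(s | u), hard-coded from the toy example: 1/3 ↦ 2, 1/2 ↦ 3, 1 ↦ 6. -/
def w6 : State → Utt → Nat
  | _,   .likelyC     => 2
  | .s3, .conditional => 0
  | _,   .conditional => 3
  | .s1, .C           => 6
  | _,   .C           => 0
  | _,   .conjAC      => 0

/-- 6 · Z(s): the scaled partition function, summed over all four utterances. -/
def Z6 (s : State) : Nat :=
  w6 s .likelyC + w6 s .conditional + w6 s .C + w6 s .conjAC

example : Z6 .s1 = 11 := rfl  -- Z(s1) = 11/6
example : Z6 .s2 = 5  := rfl  -- Z(s2) = 5/6
example : Z6 .s3 = 2  := rfl  -- Z(s3) = 2/6 = 1/3
```

The ordering Z(s3) < Z(s2) < Z(s1) mirrors how many utterances are available in each state: one in s3, two in s2, three in s1.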

S1 predictions from the toy example (Table 2)

The pragmatic speaker in each state prefers the most informative true utterance. Informativity is measured by the concentration of L0's posterior: an utterance that is true in fewer states gives L0 a sharper posterior and therefore receives a higher S1 score.
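As a worked check, the speaker probabilities follow from normalising the literal weights within each state. The derivation below assumes α = 1, no utterance cost, and the extension sizes from Table 2 (3, 2, 1, 0 for "likely C", "if A then C", "C", and "A and C"); the sums list the utterances in that order.

```latex
\begin{align*}
S_1(u \mid s) &= \frac{L_0(s \mid u)}{Z(s)}, \qquad Z(s) = \sum_{u'} L_0(s \mid u') \\[4pt]
Z(s_1) &= \tfrac13 + \tfrac12 + 1 + 0 = \tfrac{11}{6} \\
S_1(\text{C} \mid s_1) &= \frac{1}{11/6} = \tfrac{6}{11}, \quad
S_1(\text{if A, C} \mid s_1) = \frac{1/2}{11/6} = \tfrac{3}{11}, \quad
S_1(\text{likely C} \mid s_1) = \frac{1/3}{11/6} = \tfrac{2}{11} \\[4pt]
Z(s_2) &= \tfrac13 + \tfrac12 + 0 + 0 = \tfrac56 \\
S_1(\text{if A, C} \mid s_2) &= \frac{1/2}{5/6} = \tfrac35, \quad
S_1(\text{likely C} \mid s_2) = \frac{1/3}{5/6} = \tfrac25 \\[4pt]
Z(s_3) &= \tfrac13 + 0 + 0 + 0 = \tfrac13, \qquad
S_1(\text{likely C} \mid s_3) = 1
\end{align*}
```

Within a state the denominator Z(s) is shared, so the ranking of utterances is just the ranking of their literal weights.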

                              s1: S1 prefers "C" over "if A then C."

                              In s1, both "C" and "conditional" are true, but "C" is true only in s1 while "conditional" is true in both s1 and s2. So "C" is more informative. S1(C|s1) = 6/11, S1(conditional|s1) = 3/11.

                              s1: S1 prefers "if A then C" over "likely C."

                              "conditional" is true in 2 states vs "likely C" in all 3. S1(conditional|s1) = 3/11, S1(likelyC|s1) = 2/11.

                              s2: S1 prefers "if A then C" over "likely C."

                              In s2, "conditional" is true in 2 states while "likely C" is true in all 3. S1(conditional|s2) = 3/5, S1(likelyC|s2) = 2/5.

                              s2: S1 prefers "if A then C" over "C."

                              "C" is false in s2 (P(C) = 0.65 < 0.9), so S1 assigns it zero.

                              s3: "likely C" dominates — no other utterance beats it.

                              "likely C" is the only true utterance in s3. The conditional, C, and conjAC are all false, so they get zero S1 score.

L1 predictions: the core dependency inference result

                              The central prediction: hearing "if A then C" makes the pragmatic listener infer s2 (moderate dependence) over s1 (high marginals). This is because S1 in s1 would have used the more informative "C" instead of the conditional.

This is the key mechanism behind dependency inference: a conditional signals that the speaker could not have used a stronger utterance, implicating a state where the conditional is the strongest assertable utterance.

                              Each L1 comparison cancels the shared marginal and uniform prior (PMF.posterior_lt_iff_kernel_lt_of_uniform), reducing to an S1 comparison across states: a vacuous-zero at the world where the utterance is false, or a partition-dominance where the numerators agree.
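The cancellation can be written out as a worked equation. This is a sketch under the toy-model assumptions (uniform prior over three states, α = 1, no cost), using the partition values Z(s1) = 11/6, Z(s2) = 5/6, Z(s3) = 1/3 stated earlier in this section.

```latex
\begin{align*}
L_1(s \mid u) &= \frac{S_1(u \mid s)\,P(s)}{\sum_{s'} S_1(u \mid s')\,P(s')}
  \quad\Longrightarrow\quad
  \frac{L_1(s \mid u)}{L_1(s' \mid u)} = \frac{S_1(u \mid s)}{S_1(u \mid s')}
  = \frac{L_0(s \mid u)\,/\,Z(s)}{L_0(s' \mid u)\,/\,Z(s')} \\[4pt]
\frac{L_1(s_2 \mid \text{if A, C})}{L_1(s_1 \mid \text{if A, C})}
  &= \frac{(1/2)\,/\,(5/6)}{(1/2)\,/\,(11/6)} = \frac{Z(s_1)}{Z(s_2)} = \frac{11}{5} > 1 \\[4pt]
\frac{L_1(s_3 \mid \text{likely C})}{L_1(s_1 \mid \text{likely C})}
  &= \frac{(1/3)\,/\,(1/3)}{(1/3)\,/\,(11/6)} = \frac{Z(s_1)}{Z(s_3)} = \frac{11}{2} > 1
\end{align*}
```

In both comparisons the literal weights agree across the two states, so the posterior ratio is the inverse ratio of the partition functions.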

                              L1 hearing "if A then C": prefers s2 over s1.

                              The core dependency inference result. S1 in s1 would prefer "C" over "conditional" (by s1_C_gt_conditional), so hearing "conditional" makes L1 shift probability toward s2 where "conditional" is the best available utterance.

                              L1 hearing "if A then C": prefers s2 over s3.

                              s3 makes the conditional literally false, so it gets zero L1 weight.
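The posterior after hearing the conditional can be computed exactly with a small Lean 4 sketch. It is stand-alone and hypothetical rather than Linglib's `L1`: literal weights are hard-coded and scaled by 6, speaker scores are scaled by 110 (a common multiple of the scaled partitions 11, 5, 2), and the result is returned as a reduced (numerator, denominator) pair. The names `w6`, `Z6`, `s1Score`, and `L1` are assumptions of this sketch.

```lean
inductive State where
  | s1 | s2 | s3

inductive Utt where
  | likelyC | conditional | C | conjAC

/-- 6 · L0(s | u), hard-coded from the toy example. -/
def w6 : State → Utt → Nat
  | _,   .likelyC     => 2
  | .s3, .conditional => 0
  | _,   .conditional => 3
  | .s1, .C           => 6
  | _,   .C           => 0
  | _,   .conjAC      => 0

/-- 6 · Z(s): scaled partition function. -/
def Z6 (s : State) : Nat :=
  w6 s .likelyC + w6 s .conditional + w6 s .C + w6 s .conjAC

/-- 110 · S1(u | s). Each Z6 value (11, 5, 2) divides 110, so the division is exact. -/
def s1Score (u : Utt) (s : State) : Nat := w6 s u * (110 / Z6 s)

/-- L1(s | u) under a uniform prior, as a reduced fraction (numerator, denominator). -/
def L1 (u : Utt) (s : State) : Nat × Nat :=
  let n := s1Score u s
  let d := s1Score u .s1 + s1Score u .s2 + s1Score u .s3
  let g := Nat.gcd n d
  (n / g, d / g)

#eval L1 .conditional .s1  -- (5, 16)
#eval L1 .conditional .s2  -- (11, 16)
#eval L1 .conditional .s3  -- (0, 1)
```

The same function covers the other utterances: `L1 .likelyC .s3` evaluates to `(55, 87)`, and `L1 .conjAC s` returns `(0, 0)` for every state, reflecting that the posterior is undefined when the utterance has zero marginal probability.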

                              L1 hearing "C": identifies s1.

                              "C" is true only in s1, so L1 assigns it probability 1.

                              L1 hearing "likely C": prefers s3 over s1.

                              "likely C" is true in all states, but S1 in s1 prefers "C" and S1 in s2 prefers "conditional," so hearing "likely C" implicates that stronger utterances were unavailable — i.e., the state is s3 where "likely C" is the only option. L1(s3|likelyC) = 55/87 > L1(s1|likelyC) = 10/87.

                              The conditional's meaning in the RSA model equals the assertability condition: assertable iff P(C|A) ≥ θ. Since we use the direct truth table, we verify consistency with the WorldState probabilities.

                              theorem GrusdtLassiterFranke2022.C_assertable_iff_high_pC :
(assertable' Utt.C State.s1 = true ∧ ws1.pC ≥ 9 / 10) ∧ (assertable' Utt.C State.s2 = false ∧ ws2.pC < 9 / 10) ∧ assertable' Utt.C State.s3 = false ∧ ws3.pC < 9 / 10

                              "C" is assertable iff P(C) ≥ 0.9.

                              "likely C" is assertable in all states (P(C) ≥ 0.5 everywhere).

                              "A and C" is never assertable (P(A∧C) < 0.9 in all states).

Connection to causal inference

The toy example does not include causal relations as a latent variable (the full model in §3 does). However, the key qualitative prediction, that conditionals signal dependency between A and C, is captured by the L1 inference: hearing "if A then C" makes the listener prefer s2, where A and C are positively dependent (P(C|A) = 12/13 ≈ 0.923 against P(C) = 0.65), over s1, where the marginals are high but A and C are probabilistically independent (P(A∧C) = 0.81 = P(A)·P(C), so P(C|A) = P(C) = 0.9).

                              The full model adds CausalRelation (A→C, C→A, A⊥C) as a latent variable, but the dependency inference result already emerges from the simpler model via scalar implicature.

                              Causal asymmetry detection from assertability patterns.

                              If the forward conditional "if A then C" is assertable but the reverse "if C then A" is not, inferCausalRelation returns .ACausesC.

                              Conditional perfection is NOT a semantic entailment.

There exist world states where "if A then C" is assertable but the inverse "if ¬A then ¬C" is not. This supports the paper's claim that conditional perfection is a pragmatic implicature rather than a semantic entailment.
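One such world state is the toy state s1. The Lean 4 sketch below is a stand-alone illustration; whether the library's perfection_not_semantic uses this witness is not stated here, so treat the choice as an assumption. Probabilities are integer percentages, θ = 9/10 is applied by cross-multiplication, and the names `Dist`, `pNotA`, `pNotANotC`, `forward`, and `inverse` are hypothetical.

```lean
/-- Joint distribution over A and C in integer percentages (hypothetical encoding). -/
structure Dist where
  pA  : Nat  -- P(A)
  pC  : Nat  -- P(C)
  pAC : Nat  -- P(A∧C)

/-- P(¬A) in percent. -/
def Dist.pNotA (w : Dist) : Nat := 100 - w.pA
/-- P(¬A∧¬C) = 1 - P(A) - P(C) + P(A∧C), in percent. -/
def Dist.pNotANotC (w : Dist) : Nat := 100 + w.pAC - w.pA - w.pC

/-- "if A then C" at θ = 0.9: P(A) > 0 and 10·P(A∧C) ≥ 9·P(A). -/
def forward (w : Dist) : Bool :=
  decide (0 < w.pA ∧ 10 * w.pAC ≥ 9 * w.pA)
/-- "if ¬A then ¬C" at θ = 0.9: P(¬A) > 0 and 10·P(¬A∧¬C) ≥ 9·P(¬A). -/
def inverse (w : Dist) : Bool :=
  decide (0 < w.pNotA ∧ 10 * w.pNotANotC ≥ 9 * w.pNotA)

/-- The toy state s1: P(A) = P(C) = 0.9, P(A∧C) = 0.81. -/
def s1 : Dist := ⟨90, 90, 81⟩

example : forward s1 = true  := by decide  -- P(C|A)   = 0.81 / 0.9 = 0.9
example : inverse s1 = false := by decide  -- P(¬C|¬A) = 0.01 / 0.1 = 0.1
```

In s1 the forward conditional sits exactly at the threshold while the inverse has conditional probability 0.1, so assertability of "if A then C" cannot entail assertability of "if ¬A then ¬C".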

                              Experiment 1 result: causal structure inference from conditionals.

                              Participants (N≈150) hear "if A then C" and judge causal structure. The asymmetry between forward and reverse conditionals is the key finding.

                              • utterance : String
                              • pACausesC :
                              • pCCausesA :
                              • pIndependent :

                                      Fitted assertability threshold from Experiment 3.


                                        Experiment 2 perfection rates by causal context.


                                            Conditional perfection is modulated by causal context: much higher in A→C contexts than in independent contexts.

Key claims supported by the model

                                            1. Conditionals communicate dependency: L1 hearing "if A then C" infers a state with high P(C|A) relative to P(C) — i.e., a state where A and C are dependent. This is l1_conditional_prefers_s2.

                                            2. Conditional perfection is pragmatic: The semantic meaning of conditionals (assertability) does NOT entail perfection (perfection_not_semantic). Perfection arises via scalar implicature in the full model.

                                            3. Speaker informativity drives inference: S1 in s1 prefers "C" over "conditional" (s1_C_gt_conditional), so hearing "conditional" implicates that "C" was unavailable (i.e., the state is s2, not s1).

4. Weak states produce weak utterances: In s3, the speaker can only use "likely C" (s3_likelyC_dominates). Hearing "likely C" makes L1 infer s3 (l1_likelyC_prefers_s3), the state with the lowest P(C|A).

                                            The earlier Structural ↔ Probabilistic Bridge section (≈150 LOC) parameterised on the legacy CausalDynamics / Situation / causallySufficient / causallyNecessary substrate was dropped along with that substrate. The qualitative claim — structural sufficiency ∧ necessity ⇒ probabilistic ACausesC — is now witnessed at the V2 SEM level by [NL20]'s Fire/Bus/Lighthouse scenarios (NadathurLauer2020.{Fire,Bus,Lighthouse}): the make/cause divergence cases concretely demonstrate sufficiency without necessity (Bus) and necessity without sufficiency (Fire); the GLF2022 RSA model above uses WorldState directly without going through a structural-causal extractor.