Documentation

Linglib.Studies.Anderson2021

[And21]: conversation update for RSA #

[Luc59] [Sta02] [FG12]

Tell me everything you know (SCiL 2021, 244–253): multi-turn conversation in RSA. The common ground is a probability distribution over worlds substituted for the RSA world prior (Figure 4), updated each turn by convex combination with the pragmatic-listener posterior at a learning rate, with weighted, thresholded, and difference sampling for cooperative observation selection.

Main results #

Implementation notes #

The Figure-4 chain is exact ℚ≥0, parameterized by common-ground weights: each agent (l0Score/s1Score/l1Score/s2Score) is one PMF.normalizeScores application over the agent below it, the distributions are PMF.ofScores, and every prediction closes by the ofScores comparison family with one kernel certificate.

Distributional common ground #

PMF W wraps PMF W: [Sta02]'s context set with graded plausibility summing to one (§3.2), so entropy — Anderson's success criterion — and KL divergence are available on the carrier. toContextSet projects to the positive-mass worlds (PMF.support), PMF.uniformOfFintype is the empty common ground; ofWeights renormalizes non-negative weights (fn. 3). Unlike the classical context set, worlds can regain probability (fn. 7); intersection update survives only at lr = 1.

The common ground as a distribution #

Anderson's distributional common ground is a PMF W ([And21] §3.2): graded plausibility summing to one, with PMF.entropy — the success criterion — and PMF.toRealFn (the ℝ-valued masses every RSA consumer reads) already on the carrier. The classical context set is the support.

@[implicit_reducible]

A distribution projects to [Sta02]'s context set: the positive-mass worlds.

Equations
@[simp]
theorem Discourse.pmf_toContextSet {W : Type u_1} (p : PMF W) (w : W) :

MutualFriends Domain #

Worlds in the MutualFriends domain: four individuals with different feature combinations.

Instances For
    @[implicit_reducible]
    Equations
    def Anderson2021.instReprMFWorld.repr :
    MFWorldStd.Format
    Equations
    Instances For
      @[implicit_reducible]
      Equations
      • One or more equations did not get rendered due to their size.
      theorem Anderson2021.card_mfworld :
      Fintype.card MFWorld = 4
      Instances For
        @[implicit_reducible]
        Equations
        def Anderson2021.instReprMajor.repr :
        MajorStd.Format
        Equations
        Instances For
          @[implicit_reducible]
          Equations
          Instances For
            @[implicit_reducible]
            Equations
            def Anderson2021.instReprLocation.repr :
            LocationStd.Format
            Equations
            • One or more equations did not get rendered due to their size.
            Instances For

              Utterances available to speakers. Includes a null utterance for passing when the speaker has no confident observation to share.

              Instances For
                @[implicit_reducible]
                Equations
                Equations
                • One or more equations did not get rendered due to their size.
                Instances For
                  @[implicit_reducible]
                  Equations
                  • One or more equations did not get rendered due to their size.

                  Distributional Common Ground (re-exported from substrate) #

                  The PMF substrate above supplies the weights; the conversation-update and RSA content below consume them.

                  CommonGround Update #

                  noncomputable def Anderson2021.updateCG {W : Type u_1} (cg post : PMF W) (lr : ) (hlr : 0 lr) (hlr1 : lr 1) :
                  PMF W

                  Common-ground update ([And21] §6, Figure 2): the lr-mixture of the current common ground with the pragmatic-listener posterior — PMF.mix at the learning rate.

                  Equations
                  Instances For
                    theorem Anderson2021.updateCG_eq {W : Type u_1} (cg post : PMF W) (lr : ) (h0 : 0 lr) (h1 : lr 1) (w : W) :
                    (updateCG cg post lr h0 h1).toRealFn w = (1 - lr) * cg.toRealFn w + lr * post.toRealFn w

                    The update in ℝ: (1 − lr)·CG + lr·posterior, no rescaling needed.

                    theorem Anderson2021.updateCG_matches_linear_learning {W : Type u_1} (cg post : PMF W) (lr : ) (h0 : 0 lr) (h1 : lr 1) (w : W) :
                    (updateCG cg post lr h0 h1).toRealFn w = (1 - lr) * cg.toRealFn w + (1 - (1 - lr)) * post.toRealFn w

                    Bridge to [Luc59] linear learning: the CommonGround update has the same algebraic form as LinearLearner.update with retention rate 1 - lr and reinforcement target posterior:

                    CommonGround'(w) = (1 - lr) · CommonGround(w) + lr · posterior(w)     [Anderson]
                    v'(a)  = α · v(a) + (1 - α) · r(a)                [Luce §4.C]
                    

                    Setting α = 1 - lr and r = posterior makes the formulas identical. Multi-turn conversation IS iterated learning over distributions.

                    theorem Anderson2021.updateCG_lr_zero {W : Type u_1} (cg post : PMF W) (w : W) :
                    (updateCG cg post 0 ).toRealFn w = cg.toRealFn w

                    With learning rate 0, the CommonGround is unchanged (full retention).

                    theorem Anderson2021.updateCG_lr_one {W : Type u_1} (cg post : PMF W) (w : W) :
                    (updateCG cg post 1 ).toRealFn w = post.toRealFn w

                    With learning rate 1, the CommonGround is replaced by the posterior.

                    Conversation State #

                    structure Anderson2021.ConversationState (W : Type u_1) :
                    Type u_1

                    The state of a two-participant conversation (Figure 2).

                    Tracks the common ground (distributional), each participant's private beliefs, and the learning rate for updates. In the shared CommonGround model (§5.1, Figure 4), both participants access the same cg. In the approximate CommonGround model (§5.2, Figure 6), each maintains a separate approximation (not yet formalized).

                    The distributional CommonGround enters the RSA model at two points (Figure 4): inside the literal listener and as the pragmatic listener's prior. At each turn the chain is rebuilt at the current CommonGround (conversationStep).

                    • cg : PMF W
                    • belA : PMF W
                    • belB : PMF W
                    • lr :
                    • speakerIsA : Bool
                    Instances For
                      noncomputable def Anderson2021.ConversationState.initial {W : Type u_1} [Fintype W] [Nonempty W] (belA belB : PMF W) (lr : ) :

                      Initial conversation state: uniform CommonGround, specified beliefs, A speaks first.

                      Equations
                      Instances For

                        Observation Sampling #

                        noncomputable def Anderson2021.weightedSample {W : Type u_1} (bel : PMF W) :
                        W

                        Weighted sampling (§7.1): a world sampled in proportion to the speaker's belief — truthful (zero-probability worlds never sampled) but flip-flop-prone for noncommittal speakers (Figure 12).

                        Equations
                        Instances For
                          noncomputable def Anderson2021.thresholdedSample {W : Type u_1} (bel : PMF W) (θ : ) :
                          W

                          Thresholded sampling (§7.1.1): worlds below the confidence threshold are filtered out; with none left the speaker passes with the null utterance and "all updates are skipped" (Figure 13).

                          Equations
                          Instances For
                            noncomputable def Anderson2021.differenceSample {W : Type u_1} (bel cg : PMF W) :
                            W

                            Difference-based sampling (§7.2.1): worlds weighted by "the largest (positive) difference in probability between the speaker's beliefs and the Common Ground" — the paper's own truncation at zero (its fn. 14: probability reductions are also informative but assertions can't convey them). Non-redundancy as Gricean Quantity.

                            Equations
                            Instances For
                              theorem Anderson2021.differenceSample_nonneg {W : Type u_1} (bel cg : PMF W) (w : W) :

                              BToM Shared-State Update #

                              noncomputable def Anderson2021.toBToMSharedUpdate {W : Type u_1} {U : Type u_2} [Fintype W] (posteriorFn : UPMF W) (lr : ) (hlr : 0 lr) (hlr1 : lr 1) :
                              PMF WUWPMF W

                              Linglib bridge (not in the paper): the common-ground update has the type BToMModel.sharedUpdate expects (Shared → Action → World → Shared at Shared := PMF W), instantiating BToM's discourse dynamics with a distributional shared state. The World argument is unused — the update reads the utterance-derived posterior, never the true world.

                              Equations
                              Instances For

                                The Figure-5 beliefs #

                                The §5.1.1 scenario — A thinks the person is Nancy, B thinks Katie — is qualitative in the paper; the 3:1 peak (renormalised to ½ there, 1/6 elsewhere) is this file's illustrative instantiation.

                                noncomputable def Anderson2021.peakedBeliefs (p : MFWorld) :

                                Beliefs peaked on one world: illustrative 3:1 weights, renormalised.

                                Equations
                                Instances For
                                  noncomputable def Anderson2021.beliefsA :

                                  A believes Nancy; B believes Katie (Figure 5).

                                  Equations
                                  Instances For
                                    theorem Anderson2021.peakedBeliefs_weight (p w : MFWorld) :
                                    (peakedBeliefs p).toRealFn w = (if w = p then 3 else 1) / 6

                                    Peaked beliefs favor their peak over every other world.

                                    The Figure-4 model on ℚ≥0 scores #

                                    The Figure-4 chain #

                                    Shared-CG RSA: L0 ∝ ⟦u⟧·CG, S1 ∝ LitList (α = 1, fn. 3), L1 ∝ PragSpeak·CG, with the endorsement speaker S2 ∝ L1.

                                    The score chain #

                                    def Anderson2021.l0Score (cg : MFWorldℚ≥0) (u : MFUtterance) :
                                    MFWorldℚ≥0

                                    CG-weighted literal listener ([And21] Figure 4: L0 ∝ ⟦u⟧·CG).

                                    Equations
                                    Instances For
                                      def Anderson2021.s1Score (cg : MFWorldℚ≥0) (w : MFWorld) :
                                      MFUtteranceℚ≥0

                                      Pragmatic speaker ([And21] Figure 4: S1 ∝ LitList; fn. 3: the softmax terms are omitted and probabilities renormalized, i.e. α = 1 and no cost — the speaker weight IS the literal-listener value).

                                      Equations
                                      Instances For
                                        def Anderson2021.l1Score (cg : MFWorldℚ≥0) (u : MFUtterance) :
                                        MFWorldℚ≥0

                                        Pragmatic listener ([And21] Figure 4: L1 ∝ PragSpeak·CG).

                                        Equations
                                        Instances For
                                          def Anderson2021.s2Score (cg : MFWorldℚ≥0) (w : MFWorld) :
                                          MFUtteranceℚ≥0

                                          Endorsement speaker: S2(u|w) ∝ L1(w|u) (uniform utterance prior), the standard endorsement inversion of L1 over utterances.

                                          Equations
                                          Instances For

                                            The prior-transparency mechanism #

                                            Worlds with the same truth profile get identical speaker columns — the common-ground factor cancels out of S1's normalization (PMF.normalizeScores_mul_left) — so L1 orders them purely by their common-ground weights. Every tie and tie-break below is an instance.

                                            theorem Anderson2021.s1Score_congr (cg : MFWorldℚ≥0) {w₁ w₂ : MFWorld} (h₁ : cg w₁ 0) (h₂ : cg w₂ 0) (hw : ∀ (u : MFUtterance), mfMeaning u w₁ = mfMeaning u w₂) :
                                            s1Score cg w₁ = s1Score cg w₂

                                            Same truth profile, same speaker column: the CG weight scales the whole L0 row and cancels under normalization.

                                            theorem Anderson2021.l1Score_lt_iff (cg : MFWorldℚ≥0) {u : MFUtterance} {w₁ w₂ : MFWorld} (h₁ : cg w₁ 0) (h₂ : cg w₂ 0) (hw : ∀ (u' : MFUtterance), mfMeaning u' w₁ = mfMeaning u' w₂) (hu : s1Score cg w₁ u 0) :
                                            l1Score cg u w₁ < l1Score cg u w₂ cg w₁ < cg w₂

                                            L1 orders same-profile worlds by their CG weights alone.

                                            noncomputable def Anderson2021.s1Turn1 :

                                            Turn-1 speaker (uniform CommonGround, [And21] Figure 2: CG = Uniform(worlds)).

                                            Equations
                                            Instances For
                                              noncomputable def Anderson2021.l1Turn1 :

                                              Turn-1 pragmatic listener.

                                              Equations
                                              Instances For
                                                noncomputable def Anderson2021.s2Turn1 :

                                                Turn-1 endorsement speaker.

                                                Equations
                                                Instances For
                                                  theorem Anderson2021.s1Turn1_true {u : MFUtterance} {w : MFWorld} (hw : mfMeaning u w = true) (hn : u MFUtterance.null) :
                                                  (s1Turn1 w) u = (2 / 5)

                                                  Turn-1 speaker values ([And21] Figure-4 equations at the uniform CG; derived exact rationals — Figure 5 reports the qualitative profile): 2/5 on each true specific utterance, 1/5 on null, 0 off support.

                                                  theorem Anderson2021.s1Turn1_false {u : MFUtterance} {w : MFWorld} (hw : mfMeaning u w = false) :
                                                  (s1Turn1 w) u = 0
                                                  theorem Anderson2021.l1Turn1_true {u : MFUtterance} {w : MFWorld} (hw : mfMeaning u w = true) (hn : u MFUtterance.null) :
                                                  (l1Turn1 u) w = (1 / 2)

                                                  Turn-1 listener values (derived; L1 ∝ PragSpeak·CG, Figure 4): 1/2 on each world satisfying a specific utterance, 1/4 on every world after null, 0 off support.

                                                  theorem Anderson2021.l1Turn1_false {u : MFUtterance} {w : MFWorld} (hw : mfMeaning u w = false) :
                                                  (l1Turn1 u) w = 0

                                                  Turn-1 predictions #

                                                  The null utterance conveys nothing: L1 stays uniform.

                                                  Turn 2 (Post-Update Prior) #

                                                  def Anderson2021.cgTurn2 :
                                                  MFWorldℚ≥0

                                                  CommonGround weights after hearing "studyHumanity" at turn 1.

                                                  After L1 processes "studyHumanity" with uniform prior, the posterior concentrates on nancy and sally (the German-studying worlds): L1(studyHumanity) = [0, 0, 1/2, 1/2]. Updating the normalized uniform CommonGround [1/4, 1/4, 1/4, 1/4] via updateCG with lr=0.2 (footnote 9) gives:

                                                  CommonGround'(w) = 0.8 · 1/4 + 0.2 · L1(w)
                                                  CommonGround' = [1/5, 1/5, 3/10, 3/10]
                                                  

                                                  The weights [2, 2, 3, 3] are proportional to [1/5, 1/5, 3/10, 3/10], which is the exact post-update CommonGround from the paper's Figure 5, panel 1A. Since RSA normalizes, proportional weights give identical predictions.

                                                  Equations
                                                  Instances For
                                                    noncomputable def Anderson2021.s1Turn2 :

                                                    Turn-2 speaker at the post-update CommonGround.

                                                    Equations
                                                    Instances For

                                                      Turn-2 predictions #

                                                      The key multi-turn prediction: turn 1 ties Nancy and Katie under "likeOutdoors"; the CG update breaks the tie toward Nancy.

                                                      The CG-adapted speaker #

                                                      The CG-weighted L0 makes speakers prefer new over redundant information: after the update, Nancy's speaker switches to "likeOutdoors" (re-asserting "studyHumanity" is redundant at 3:3 weights) and Ina's to "studyScience" (Sally now dominates the indoor partition). At turn 1 both pairs were symmetric (s1_turn1_informativity).

                                                      The endorsement speaker inverts L1: for Nancy, "studyHumanity" beats the false "studyScience" and ties the symmetric "likeOutdoors".

                                                      Parametric RSA and Conversation Step #

                                                      noncomputable def Anderson2021.conversationStep (cg : MFWorldℚ≥0) (u : MFUtterance) (lr : ) (hlr : 0 lr) (hlr1 : lr 1) :

                                                      One Figure-2 loop step on the ℚ≥0 face: build the chain at the current common ground, take the L1 posterior, and mix it in at the learning rate. A dead score row — an utterance contradicting the whole common ground — skips the update (§7.1's null behaviour).

                                                      Equations
                                                      • One or more equations did not get rendered due to their size.
                                                      Instances For

                                                        With lr = 0, the conversation step leaves the CommonGround unchanged.

                                                        Qualitative information-sharing properties #

                                                        An informative utterance concentrates L1 above the uniform null baseline (1/2 vs 1/4 on Nancy).

                                                        Bridge to Classical CommonGround Update #

                                                        theorem Anderson2021.lr_one_excludes_false_worlds (cg post : PMF MFWorld) (w : MFWorld) (h : post.toRealFn w = 0) :

                                                        The [Sta02] set-intersection update survives only at the lr = 1 limit: with the prior discarded, a zero-posterior world leaves the context set.

                                                        theorem Anderson2021.graded_update_keeps_false_world (cg post : PMF MFWorld) (w : MFWorld) (hcg : 0 < cg.toRealFn w) (lr : ) (h0 : 0 lr) (h1 : lr < 1) :

                                                        The graded divergence (fn. 7: "worlds can regain probability"): at any lr < 1 the prior keeps a ruled-out world alive with mass (1 − lr)·cg w — the update never deletes a world the prior supports.

                                                        Exact Numerical Predictions (turn 1) #

                                                        Exact turn-1 values #

                                                        theorem Anderson2021.l1_null_val (w : MFWorld) :
                                                        (l1Turn1 MFUtterance.null) w = (1 / 4)

                                                        Null gives uniform L1: every world has the same S1(null|w) by the domain's symmetry, so L1(w|null) = CommonGround(w)/Σ CommonGround = 1/4.

                                                        Approximate CommonGround Model (§5.2, Figure 6) #

                                                        Approximate common ground #

                                                        Separate speaker/listener CG representations (Figure 6): the listener's Pragmatic Listener runs on their own beliefs, so divergence can arise when those differ from the common ground.

                                                        structure Anderson2021.ApproxCGState (W : Type u_1) :
                                                        Type u_1

                                                        State for the Approximate CommonGround model (§5.2, Figure 6).

                                                        • cgA : PMF W
                                                        • cgB : PMF W
                                                        • belA : PMF W
                                                        • belB : PMF W
                                                        • lr :
                                                        • speakerIsA : Bool
                                                        Instances For
                                                          noncomputable def Anderson2021.ApproxCGState.initial {W : Type u_1} [Fintype W] [Nonempty W] (belA belB : PMF W) (lr : ) :
                                                          Equations
                                                          Instances For
                                                            noncomputable def Anderson2021.approxL1 (cgL belL : MFWorldℚ≥0) (u : MFUtterance) :

                                                            Approximate comprehension listener ([And21] Figure 6): L0/S1 run over the listener's CommonGround approximation CG_L, but the Bayesian inversion uses the listener's private beliefs B_L as the prior.

                                                            Equations
                                                            Instances For

                                                              When beliefs equal the CommonGround, the approximate model reduces to the shared CommonGround model — the split is only meaningful when they diverge.

                                                              Belief Update Model (§6, Figure 8) #

                                                              The belief update model extends the conversation system by also updating participants' private beliefs. After comprehension, the listener updates their beliefs via the same linear rule as CommonGround update:

                                                              Bel'(w) = (1 - lr_bel) · Bel(w) + lr_bel · posterior(w)
                                                              

                                                              The speaker does not update beliefs (they already knew the info). Different learning rates for CommonGround vs beliefs allow modeling:

                                                              structure Anderson2021.BeliefUpdateState (W : Type u_1) :
                                                              Type u_1

                                                              State for the belief update model (Figure 8). Extends approximate CommonGround with separate learning rates for CommonGround and beliefs.

                                                              • cgA : PMF W
                                                              • cgB : PMF W
                                                              • belA : PMF W
                                                              • belB : PMF W
                                                              • cgLR :

                                                                Learning rate for CommonGround updates.

                                                              • belLR :

                                                                Learning rate for belief updates (may be lower for skeptical agents).

                                                              • speakerIsA : Bool
                                                              Instances For
                                                                noncomputable def Anderson2021.BeliefUpdateState.initial {W : Type u_1} [Fintype W] [Nonempty W] (belA belB : PMF W) (cgLR belLR : ) :
                                                                Equations
                                                                • One or more equations did not get rendered due to their size.
                                                                Instances For
                                                                  theorem Anderson2021.belief_update_is_linear_learning {W : Type u_1} [Fintype W] (bel post : PMF W) (lr : ) (h0 : 0 lr) (h1 : lr 1) (w : W) :
                                                                  (updateCG bel post lr h0 h1).toRealFn w = (1 - lr) * bel.toRealFn w + lr * post.toRealFn w

                                                                  Belief update is algebraically identical to CommonGround update — both are instances of [Luc59]'s linear learning rule. The only difference is the learning rate parameter and the interpretation (private vs shared).

                                                                  Noncommittal Speaker Problem (§7.1) #

                                                                  A speaker with uniform beliefs (no private information) assigns equal weight to all worlds under weighted sampling. Since no observation is more probable than any other, the speaker makes random assertions about worlds they don't know, causing the CommonGround to flip-flop (Figure 12).

                                                                  Solutions:

                                                                  1. Threshold sampling (§7.1.1): filter out low-confidence worlds; a noncommittal speaker passes (null utterance) instead of guessing.
                                                                  2. Uncertainty-based lr (§6.3): scale the CommonGround update rate by the listener's uncertainty, so confident listeners resist random input.
                                                                  theorem Anderson2021.uniform_weighted_constant (w₁ w₂ : MFWorld) :
                                                                  weightedSample (PMF.uniformOfFintype MFWorld) w₁ = weightedSample (PMF.uniformOfFintype MFWorld) w₂

                                                                  Uniform beliefs assign equal weight to all worlds under weighted sampling — a noncommittal speaker has no basis for choosing.

                                                                  theorem Anderson2021.threshold_filters_uniform (θ : ) ( : 1 < θ) (w : MFWorld) :
                                                                  thresholdedSample (PMF.uniformOfFintype MFWorld) θ w = 0

                                                                  Threshold sampling filters out all worlds when beliefs don't exceed the threshold. Every mass of a distribution is ≤ 1, so any θ > 1 produces zero weight everywhere — the speaker passes (Figure 13).

                                                                  theorem Anderson2021.threshold_preserves_confident {W : Type u_1} (bel : PMF W) (θ : ) (w : W) (h : bel.toRealFn w θ) :
                                                                  thresholdedSample bel θ w = bel.toRealFn w

                                                                  Threshold preserves confident worlds: weights above θ pass through.

                                                                  Redundancy and Difference Sampling (§7.2) #

                                                                  Under weighted sampling, a speaker whose beliefs match the CommonGround keeps repeating already-shared information (Figure 14). Difference-based sampling fixes this by weighting worlds by max(0, Bel(w) - CommonGround(w)): worlds already established in the CommonGround get zero weight.

                                                                  Combined with thresholding, thresholded difference-based sampling gives the best behavior (Figure 15): informed speakers contribute new information, noncommittal speakers pass.

                                                                  theorem Anderson2021.difference_zero_when_aligned {W : Type u_1} (cg : PMF W) (w : W) :
                                                                  differenceSample cg cg w = 0

                                                                  When beliefs match the CommonGround exactly, difference sampling assigns zero weight to all worlds — nothing new to contribute.

                                                                  theorem Anderson2021.difference_positive_when_exceeds {W : Type u_1} (bel cg : PMF W) (w : W) (h : cg.toRealFn w < bel.toRealFn w) :
                                                                  0 < differenceSample bel cg w

                                                                  Difference sampling assigns positive weight when belief exceeds CommonGround — these worlds carry new information not yet in the common ground.