Linglib.Studies.BarnettEtAl2022

[BGH22]: the weak evidence effect

RSA with a persuasive speaker whose utility adds β · ln L0(w*|u) for a goal state w* to the epistemic term (eq. 6; β = 0 is standard RSA). With w* = "longer" the speaker weight is L0(longer|u)^β · 𝟙[u ∈ w] (eq. 8; α = 1, so the exponent is β, here 2). A pragmatic listener who expects the strongest available evidence reads weak positive evidence as implying no stronger evidence exists — so it backfires.

The paper's Stick Contest (5 sticks from {1,…,9}, 126 worlds) is formalized at 3 sticks from {1,…,5} (10 worlds, midpoint 3), preserving the load-bearing structure: the prior favors ¬longer (P(longer) = 2/5) and L0(longer|·) is monotone in stick length.

Main results

Implementation notes

The uniform world prior cancels from both listeners, so the chain is two PMF.ofScores levels over ℚ≥0 scores and every prediction is one event-mass kernel certificate.

Domain Types

Stick lengths 1–5

    def BarnettEtAl2022.instReprStick.repr :
    Stick → ℕ → Std.Format

      Worlds: sets of 3 distinct sticks from {1,...,5}. C(5,3) = 10 worlds.

          Whether a stick is available in a given world.

            A world is "longer" if the average stick length exceeds the midpoint (3); equivalently, the three sticks sum past 9. 4 of 10 worlds qualify.
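
As a cross-check on the counts above, here is a minimal Python sketch that enumerates the simplified domain. The names `worlds` and `is_longer` are illustrative stand-ins, not the library's Lean definitions.

```python
from itertools import combinations

STICKS = range(1, 6)   # stick lengths 1..5
MIDPOINT = 3           # midpoint of the simplified domain

# A world is a set of 3 distinct sticks.
worlds = [frozenset(c) for c in combinations(STICKS, 3)]

def is_longer(world):
    """True when the mean stick length exceeds the midpoint (sum > 9)."""
    return sum(world) > MIDPOINT * len(world)

longer_worlds = [w for w in worlds if is_longer(w)]
print(len(worlds), len(longer_worlds))            # 10 4
print(sorted(sorted(w) for w in longer_worlds))   # [[1, 4, 5], [2, 3, 5], [2, 4, 5], [3, 4, 5]]
```

Writing the test as `sum(world) > MIDPOINT * len(world)` keeps the comparison in integers, which matches the "sum past 9" reading.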

              Persuasive-speaker scores

              The literal listener's longer-probability per stick: each stick appears in six worlds, and l0LongerQ_eq_eventMass certifies these fractions against the chain.

                l0LongerQ agrees with the chain: it is the literal listener's longer-event mass at each stick.
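
The per-stick fractions can be recomputed by direct counting. The sketch below is an independent check in Python with exact rationals; `l0_longer` is a hypothetical helper and does not reproduce how l0LongerQ is defined in the library.

```python
from fractions import Fraction
from itertools import combinations

worlds = [frozenset(c) for c in combinations(range(1, 6), 3)]

def is_longer(world):
    return sum(world) > 9

def l0_longer(stick):
    """L0(longer | stick): share of the worlds containing the stick that are longer."""
    containing = [w for w in worlds if stick in w]
    return Fraction(sum(is_longer(w) for w in containing), len(containing))

table = {u: l0_longer(u) for u in range(1, 6)}
for stick, prob in table.items():
    print(stick, prob)   # 1 1/6, 2 1/3, 3 1/3, 4 1/2, 5 2/3 (one pair per line)
```

Each stick appears in six worlds, so every fraction has denominator 6 before reduction.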

                Prior probability of "longer": 4 out of 10 worlds

                  def BarnettEtAl2022.s1Score (w : StickWorld) (u : Stick) :
                  ℚ≥0

                  Persuasive-speaker weight (eq. 8 at β = 2): L0(longer|u)² · 𝟙[u ∈ w].

                    The chain

                    The world prior is uniform, so it cancels from both listeners: L0 is the normalized extension indicator, the persuasive speaker normalizes s1Score per world, and L1 is the normalized speaker column.
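
The three levels described above can be sketched in Python with exact rationals. This is an illustration under the stated assumptions (uniform prior, β = 2), not the library's PMF.ofScores construction; `normalize`, `l0`, `s1`, and `l1` are hypothetical names.

```python
from fractions import Fraction
from itertools import combinations

BETA = 2
STICKS = range(1, 6)
WORLDS = [frozenset(c) for c in combinations(STICKS, 3)]

def is_longer(world):
    return sum(world) > 9

def normalize(scores):
    """Turn non-negative scores into a probability table."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}

def l0(stick):
    """Literal listener: normalized extension indicator over worlds."""
    return normalize({w: Fraction(int(stick in w)) for w in WORLDS})

def l0_longer(stick):
    return sum(p for w, p in l0(stick).items() if is_longer(w))

def s1(world):
    """Persuasive speaker: L0(longer|u)^BETA on available sticks, normalized per world."""
    return normalize({u: l0_longer(u) ** BETA * int(u in world) for u in STICKS})

def l1(stick):
    """Pragmatic listener: normalized speaker column (the uniform prior cancels)."""
    return normalize({w: s1(w)[stick] for w in WORLDS})

print(s1(frozenset({1, 4, 5})))   # sticks 1, 4, 5 get 1/26, 9/26, 8/13; the rest get 0
```

In world {1, 4, 5} the unnormalized scores are 1/36, 9/36, and 16/36, so the speaker shows stick 5 with probability 16/26 = 8/13.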

                    noncomputable def BarnettEtAl2022.l0 (u : Stick) :

                    Literal listener over worlds (uniform prior conditioned on the extension).

                      noncomputable def BarnettEtAl2022.l0Event (u : Stick) (P : StickWorld → Bool) :
                      ENNReal

                      Event marginal of the literal listener.

                        noncomputable def BarnettEtAl2022.s1Persuade (w : StickWorld) :
                        PMF Stick

                        Persuasive speaker (eq. 8 at β = 2).

                          noncomputable def BarnettEtAl2022.l1w (u : Stick) :

                          Pragmatic listener over worlds: the normalized speaker column (the uniform prior cancels).

                            noncomputable def BarnettEtAl2022.l1Event (u : Stick) (P : StickWorld → Bool) :
                            ENNReal

                            Event marginal of the pragmatic listener.

                              Predictions — L0

                              L0(longer|s5) > L0(¬longer|s5): stick 5 is positive evidence for "longer". 4 of 6 worlds containing s5 are longer, vs 2 not-longer.

                              L0(longer|s5) > L0(longer|s4): stick 5 provides stronger evidence than s4.

                              L0(¬longer|s1) > L0(longer|s1): stick 1 is evidence against "longer". Only 1 of 6 worlds containing s1 is longer.

                              L0(longer|·) is monotonically non-decreasing in stick length (sticks 2 and 3 tie at 1/3). This structural property ensures the simplified domain faithfully mirrors the paper's full domain (Appendix Theorem 2).

                              Predictions — L1 (weak evidence effect)

                              The weak evidence effect: at β = 2, showing stick 4 — positive literal evidence — decreases the pragmatic listener's belief in "longer" below the ¬longer mass (p. 172: "the absence of strong evidence from a speaker who would be highly motivated to show it statistically implies that no such evidence exists").

                              Strong evidence works: the strongest available evidence cannot be explained away by the absence of something better.
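
Both L1 predictions can be checked numerically. The following Python sketch recomputes the pragmatic listener's longer-mass for sticks 4 and 5 at β = 2 under a uniform world prior; the helper names are illustrative and the code is not derived from the Lean proofs.

```python
from fractions import Fraction
from itertools import combinations

BETA = 2
STICKS = range(1, 6)
WORLDS = [frozenset(c) for c in combinations(STICKS, 3)]

def is_longer(world):
    return sum(world) > 9

def l0_longer(stick):
    containing = [w for w in WORLDS if stick in w]
    return Fraction(sum(is_longer(w) for w in containing), len(containing))

def s1(world, stick):
    """Normalized persuasive-speaker probability of showing `stick` in `world`."""
    if stick not in world:
        return Fraction(0)
    total = sum(l0_longer(u) ** BETA for u in world)
    return l0_longer(stick) ** BETA / total

def l1_longer(stick):
    """L1(longer | stick): longer-mass of the normalized speaker column."""
    column = {w: s1(w, stick) for w in WORLDS}
    return sum(p for w, p in column.items() if is_longer(w)) / sum(column.values())

print(float(l1_longer(4)))   # about 0.35: weak positive evidence backfires
print(float(l1_longer(5)))   # about 0.61: the strongest evidence still persuades
```

Stick 4 is shown most readily in worlds that lack stick 5, and those worlds are mostly not longer, which is what pulls L1(longer|s4) below 1/2.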

                              Bridge Theorems

                              At β=1, the persuasive utility equals combinedWeighted(1,1,...).

                              The paper's Eq. 6 (additive: U_epi + β·U_pers) equals (1+β) · combined(β/(1+β), U_epi, U_pers).
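
The rescaling identity can be spot-checked with exact rationals. The sketch assumes that `combined(λ, a, b)` denotes the convex combination (1 − λ)·a + λ·b; that reading is an assumption made for illustration and is not taken from the library's definition.

```python
from fractions import Fraction

def combined(lam, u_epi, u_pers):
    """Assumed convex combination: (1 - lam) * u_epi + lam * u_pers."""
    return (1 - lam) * u_epi + lam * u_pers

def additive(beta, u_epi, u_pers):
    """Eq. 6 form: epistemic utility plus beta times persuasive utility."""
    return u_epi + beta * u_pers

samples = [(Fraction(-1, 3), Fraction(-7, 5)), (Fraction(0), Fraction(-2))]
for beta in (Fraction(0), Fraction(1), Fraction(2), Fraction(226, 100)):
    lam = beta / (1 + beta)
    for u_epi, u_pers in samples:
        assert additive(beta, u_epi, u_pers) == (1 + beta) * combined(lam, u_epi, u_pers)
print("identity holds on all sampled inputs")
```

Under that assumption the factor (1 + β) cancels both denominators: (1 + β)·[U_epi/(1 + β) + β·U_pers/(1 + β)] = U_epi + β·U_pers.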

                              Connection to ArgumentativeStrength: stick 4 has positive argumentative strength for the goal "longer" (L0(longer|s4) = 1/2 > 2/5 = P(longer)).

                              Stick 3 does NOT have positive argumentative strength (L0(longer|s3) = 1/3 < 2/5 = P(longer)).

                              The weak evidence effect shows that argumentatively positive evidence can still backfire under a pragmatic listener model. This is the core insight connecting [BGH22] to [CF21]'s work on argumentative strength.

                              Stick 4 has positive argStr at L0 (1/2 > 2/5), yet L1 assigns more mass to ¬longer than longer after seeing s4.
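
The L0 side of these statements can be tabulated for every stick. The sketch below uses the posterior-exceeds-prior comparison quoted above as the positivity test, which is an assumption about how ArgumentativeStrength is signed; `l0_longer` and `positive` are illustrative names.

```python
from fractions import Fraction
from itertools import combinations

worlds = [frozenset(c) for c in combinations(range(1, 6), 3)]
longer = [w for w in worlds if sum(w) > 9]
prior = Fraction(len(longer), len(worlds))   # P(longer) = 4/10

def l0_longer(stick):
    """L0(longer | stick) by counting worlds that contain the stick."""
    containing = [w for w in worlds if stick in w]
    return Fraction(sum(1 for w in containing if w in longer), len(containing))

# A stick counts as argumentatively positive when it raises P(longer) above the prior.
positive = {u: l0_longer(u) > prior for u in range(1, 6)}
print(prior, positive)   # 2/5 {1: False, 2: False, 3: False, 4: True, 5: True}
```

Only sticks 4 and 5 clear the 2/5 prior, so stick 4 is the weakest stick that still counts as positive evidence at L0.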

                              Experimental Design & Behavioral Data

                              Listener type inferred from speaker expectation phase

                                  Evidence strength conditions (distance from midpoint 5")

                                      Which contestant goes first

                                          Stick Contest design parameters

                                          • nSticks :
                                          • minLength :
                                          • maxLength :
                                          • midpoint :
                                          • nParticipants :

                                              The actual experimental parameters

                                                Proportion expecting strongest evidence (pragmatic listeners)

                                                  Proportion expecting weaker evidence (literal listeners)

                                                    Key interaction: speaker expectations × evidence strength. t(718) = 5.2, p < 0.001 (p. 175)

                                                    • tStatistic :
                                                    • df :
                                                    • pLessThan :

                                                          Behavioral result for a listener group

                                                          • listenerType : ListenerType
                                                          • nParticipants :
                                                          • meanSlider :
                                                          • ci95Lower : Option
                                                          • ci95Upper : Option

                                                              Pragmatic group: weak evidence backfires (mean below 50). 95% CI: [32.3, 37.3] (paper p. 175).

                                                                Literal group: no weak evidence effect (mean at 50). CIs not reported in paper.

                                                                  Pragmatic group shows backfire: mean significantly below 50 (midpoint)

                                                                  Literal group shows no backfire: mean at midpoint

                                                                  The two groups differ in the predicted direction

                                                                  Model Comparison (Table 1)

                                                                  Model families compared

                                                                      Model variant (how individual differences are handled)

                                                                          Model comparison result from Table 1

                                                                              Table 1 data

                                                                                The RSA speaker-dependent model has the best (highest) log-likelihood

                                                                                theorem BarnettEtAl2022.rsa_speaker_dep_best_waic :
                                                                                -164 / 10 < -133 / 10

                                                                                The RSA speaker-dependent model has the best (lowest) WAIC

                                                                                Fitted Parameters

                                                                                Fitted parameters for the best model (RSA speaker-dependent). β̂ = 2.26 and mixture weights from main text (p. 178); β̄ = 2.03 and ō = −0.13 from Fig 3B caption (p. 177).

                                                                                • betaMAP :
                                                                                • betaCV :
                                                                                • responseOffsetCV :
                                                                                • pragmaticMixWeight :
                                                                                • literalMixWeight :
                                                                                    Equations
                                                                                    • BarnettEtAl2022.bestModelParams = { betaMAP := 226 / 100, betaCV := 203 / 100, responseOffsetCV := -13 / 100, pragmaticMixWeight := 99 / 100, literalMixWeight := 1 / 10 }

                                                                                      β > 0 provides strong support for non-zero persuasive bias

                                                                                      Pragmatic group is best explained by J1 (pragmatic listener model)

                                                                                      Literal group is best explained by J0 (literal listener model)

                                                                                      Model–Data Connection

                                                                                      The RSA model predicts the qualitative pattern underlying the observed interaction between listener type and evidence strength (t(718) = 5.2, p < 0.001). The literal model (L0) assigns s4 positive argumentative strength, predicting no backfire. The pragmatic model (L1) shows backfire. The experiment confirms exactly this divergence: pragmatic participants' mean (34.7) falls below neutral (50), while literal participants' mean (50.1) does not.