Linglib.Studies.DegenEtAl2020

[DHG+20]: When Redundancy Is Useful #

[FG12] [DR95] [EBF06] [Gri75] [KD21] [WKM15]

Standard RSA with a Boolean semantics predicts no preference for overmodified referring expressions — if "small" already identifies the target, adding "blue" is literally uninformative. Yet speakers routinely overmodify, more with color than with size. [DHG+20] resolve this by relaxing the semantics to a continuous meaning φ(u, o) ∈ [0,1] (a Product-of-Experts over noisy feature channels): redundant modifiers then carry real information, and the color/size asymmetry follows from color channels being less noisy than size channels.

The model is the mathlib-PMF RSA pipeline (RSA.L0OfMeaning / RSA.S1Belief, [FG12]): the literal listener L0(·|u) : PMF World normalises φ over worlds, and the speaker S1(·|w) : PMF U satisfies S1(u|w) ∝ L0(w|u)^α · cost(u). With α = 1 and a constant cost factor of 1 (fun _ => 1, i.e. zero cost), each prediction is one application of S1Belief_apply_lt_iff_score_lt: the normalising constant (partition function) cancels, leaving a comparison of L0 values in ℝ≥0∞.
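
The cancellation can be spelled out as a short derivation. This is a sketch of the arithmetic only, assuming α = 1, cost ≡ 1, and a normaliser Z(w) that is nonzero and finite; the symbol Z(w) is introduced here for illustration, and the block is not the statement of S1Belief_apply_lt_iff_score_lt itself.

```latex
\begin{align*}
S_1(u \mid w) &= \frac{L_0(w \mid u)^{\alpha}\,\mathrm{cost}(u)}{Z(w)},
\qquad Z(w) = \sum_{u'} L_0(w \mid u')^{\alpha}\,\mathrm{cost}(u'),\\
S_1(u_1 \mid w) < S_1(u_2 \mid w)
&\iff \frac{L_0(w \mid u_1)}{Z(w)} < \frac{L_0(w \mid u_2)}{Z(w)}
\qquad (\alpha = 1,\ \mathrm{cost} \equiv 1)\\
&\iff L_0(w \mid u_1) < L_0(w \mid u_2)
\qquad (0 < Z(w) < \infty).
\end{align*}
```

Because Z(w) depends only on the world w, it is the same on both sides, which is why each speaker prediction reduces to a single listener comparison.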

Main results #

Verified data (prose, per [DHG+20]) #

Effect sizes are documented here, not encoded as Lean data.

Exp 1 (§3): main effect of sufficient property β = 3.54, SE = .22; scene-variation × property interaction β = 2.26, SE = .74; BDA-fitted noise (Figure 10) MAP x_color = .88, HDI [.85, .92]; x_size = .79, HDI [.76, .80]; near-zero costs β_c ≈ .02/.03, confirming color > size discrimination and the No-Brevity regime.

Exp 2 (§4.3): typicality β = −4.17, informativeness β = −5.56, color-competitor β = 0.71 (all p < .0001); [WKM15] found the same typicality direction (β = −2.36).

Exp 3 (§5.2): sub-necessary β = 2.11, basic-vs-super β = .60, typicality β = 4.82, length β = −.95, frequency β = .08 (NS); the BDA fits a substantial length cost (β_L = 2.69), so, unlike modifiers, nominal choice is not in the No-Brevity regime.

Modifier scene (Exp 1) #

Three pins varying in size and colour; the target is the small blue pin.

The seven referring expressions, each with an implicit head noun "pin": four single adjectives and three size+colour pairs.

@[reducible, inline]

The target object.

noncomputable def DegenEtAl2020.φ : Utterance → World → ENNReal

Continuous meaning φ(u, o) ∈ ℝ≥0∞ via a Product of Experts: a single adjective is scored by its own noise channel, and a size+colour pair by the product of its two channels.

noncomputable def DegenEtAl2020.L0 (u : Utterance) : PMF World

Literal listener L0(·|u) : PMF World, normalising the continuous meaning.

theorem DegenEtAl2020.L0_small_target : (L0 Utterance.small) target = ENNReal.ofReal (2 / 3)

L0(target | "small") = 2/3: size is sufficient but noisy, so the value stays below 1.

theorem DegenEtAl2020.L0_smallBlue_target : (L0 Utterance.smallBlue) target = ENNReal.ofReal (99 / 124)

L0(target | "small blue") = 99/124: the redundant colour sharpens the listener's posterior via the Product of Experts.

theorem DegenEtAl2020.L0_blue_target : (L0 Utterance.blue) target = ENNReal.ofReal (99 / 199)

L0(target | "blue") = 99/199: colour alone is insufficient, since two objects are blue.
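
The three listener values can be checked with exact rational arithmetic. The following is a minimal sketch, not the Lean definition: the scene layout (small blue target, big blue, big red) and the channel fidelities 4/5 for size and 99/100 for colour are assumptions, chosen because they reproduce the stated values, and every name in the block is hypothetical.

```python
from fractions import Fraction

# ASSUMED parameters: picked because they reproduce the three stated values;
# they are not read from the Lean source.
X_SIZE = Fraction(4, 5)      # hypothetical size-channel fidelity
X_COLOR = Fraction(99, 100)  # hypothetical colour-channel fidelity

# ASSUMED scene: (size, colour) of each pin; index 0 is the target.
SCENE = [("small", "blue"), ("big", "blue"), ("big", "red")]
TARGET = 0


def channel(word, value, fidelity):
    """Noisy feature channel: fidelity on a match, 1 - fidelity otherwise."""
    return fidelity if word == value else 1 - fidelity


def phi(utterance, obj):
    """Product of Experts: multiply the channels of the words that are present."""
    size_word, colour_word = utterance
    size, colour = obj
    score = Fraction(1)
    if size_word is not None:
        score *= channel(size_word, size, X_SIZE)
    if colour_word is not None:
        score *= channel(colour_word, colour, X_COLOR)
    return score


def literal_listener(utterance):
    """L0(. | u): normalise phi(u, .) over the objects in the scene."""
    scores = [phi(utterance, obj) for obj in SCENE]
    total = sum(scores)
    return [s / total for s in scores]


print(literal_listener(("small", None))[TARGET])    # 2/3
print(literal_listener(("small", "blue"))[TARGET])  # 99/124
print(literal_listener((None, "blue"))[TARGET])     # 99/199
```

Under these assumed fidelities the colour channel is less noisy than the size channel, which is the asymmetry the continuous semantics relies on.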

noncomputable def DegenEtAl2020.S1 (w : World) : PMF Utterance

Pragmatic speaker S1(u|w) ∝ L0(w|u) (α = 1, zero cost), a PMF Utterance.

Main result: S1 strictly prefers the overmodified "small blue" over the sufficient "small", so overmodification is rational under noisy perception.

The sufficient "small" beats the colour-only "blue", which fits two objects (the size principle).
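
To see the speaker ordering numerically, the sketch below normalises L0(target | u) over a seven-utterance inventory. It rests on assumptions: the scene, the channel fidelities (4/5 and 99/100), and the exact utterance inventory are inferred from the prose and the stated listener values, not taken from the Lean source, and all names are hypothetical.

```python
from fractions import Fraction

# ASSUMED fidelities and scene, inferred from the stated L0 values.
X_SIZE, X_COLOR = Fraction(4, 5), Fraction(99, 100)
SCENE = [("small", "blue"), ("big", "blue"), ("big", "red")]  # index 0 = target

# ASSUMED inventory: four single adjectives and three size+colour pairs.
UTTERANCES = {
    "small": ("small", None),
    "big": ("big", None),
    "blue": (None, "blue"),
    "red": (None, "red"),
    "small blue": ("small", "blue"),
    "big blue": ("big", "blue"),
    "big red": ("big", "red"),
}


def phi(utt, obj):
    """Product of Experts over the size and colour channels."""
    score = Fraction(1)
    for word, value, fidelity in zip(utt, obj, (X_SIZE, X_COLOR)):
        if word is not None:
            score *= fidelity if word == value else 1 - fidelity
    return score


def l0(utt, target):
    """L0(target | utt): phi normalised over the scene."""
    scores = [phi(utt, obj) for obj in SCENE]
    return scores[target] / sum(scores)


def s1(target):
    """S1(. | w) with alpha = 1 and cost = 1: normalise L0(w | u) over u."""
    scores = {name: l0(utt, target) for name, utt in UTTERANCES.items()}
    z = sum(scores.values())
    return {name: s / z for name, s in scores.items()}


speaker = s1(0)
# Print utterances from most to least preferred for the target.
for name, p in sorted(speaker.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} {float(p):.3f}")
```

The ranking places "small blue" above "small" and "small" above "blue"; since the normaliser is shared, the same ordering already holds for the unnormalised listener values.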

φ grounded in the noise channels #

Boolean baseline #

Boolean (zero-noise) meaning: each feature either matches (true) or does not (false).

noncomputable def DegenEtAl2020.boolL0 (u : Utterance) : PMF World

Boolean literal listener: uniform on the extension.

noncomputable def DegenEtAl2020.boolS1 (w : World) :

Boolean pragmatic speaker.

The Boolean model shows no overmodification preference: "small" already identifies the target perfectly, so adding "blue" adds nothing.
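
The Boolean tie can be illustrated with a small sketch. The scene is an assumption consistent with the prose (only the target is small, two pins are blue), and the function names are hypothetical rather than the Lean definitions.

```python
from fractions import Fraction

# ASSUMED scene: (size, colour) of each pin; index 0 is the target.
SCENE = [("small", "blue"), ("big", "blue"), ("big", "red")]


def matches(utt, obj):
    """Boolean meaning: every word that is present must match the object."""
    return all(word is None or word == value for word, value in zip(utt, obj))


def bool_l0(utt, target):
    """Uniform over the utterance's extension, zero outside it."""
    extension = [i for i, obj in enumerate(SCENE) if matches(utt, obj)]
    return Fraction(1, len(extension)) if target in extension else Fraction(0)


small = bool_l0(("small", None), 0)
small_blue = bool_l0(("small", "blue"), 0)
blue = bool_l0((None, "blue"), 0)
print(small, small_blue, blue)  # 1 1 1/2
```

Since "small" and "small blue" receive the same listener value, a zero-cost speaker proportional to L0 assigns them equal probability, so no preference for the overmodified form arises.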

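No-Brevity bridge #

The continuous-semantics model (cs-RSA) operates in [DR95]'s No-Brevity regime (zero cost, fitted β_c ≈ 0), and the two Gricean Quantity sub-maxims, Q1 (be as informative as required) and Q2 (do not be more informative than required), are independent [Gri75]. Over-description therefore counts as satisfying Q1 (informativity under noise), not as violating Q2.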

Nominal scene (Exp 3): overspecification via typicality #

The same mechanism with a typicality meaning for nouns: a graded φ_typ ∈ [0,1] plays the role that channel noise plays for adjectives. Values are illustrative (the paper uses elicited typicality norms): the dalmatian is a very typical dalmatian, a typical dog, and a moderately typical animal.

The target is a dalmatian among a cat and a bird; "dog" is the sufficient basic-level term.

Noun utterances at three taxonomic levels: subordinate ("dalmatian"), basic ("dog"), and superordinate ("animal").

noncomputable def DegenEtAl2020.nomL0 (u : NomUtterance) :

Nominal literal listener.

L0(dalmatian | "dalmatian") = 95/97: near-perfect identification via typicality.

L0(dalmatian | "dog") = 8/9: the basic-level term discriminates well.

L0(dalmatian | "animal") = 1/3: no discrimination among the three objects.

noncomputable def DegenEtAl2020.nomS1 (w : NomWorld) :

Nominal pragmatic speaker.

Nominal overspecification: S1 prefers the subordinate "dalmatian" over the sufficient basic "dog", the noun analogue of csrsa_overmod_preferred.
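
The nominal case can be sketched the same way, with typicality values in place of noise channels. The typicality table below is hypothetical: its entries were picked so that the listener values come out as the stated 95/97, 8/9 and 1/3, and the Lean source may use different numbers.

```python
from fractions import Fraction

# HYPOTHETICAL typicality values phi_typ(noun, object); chosen only so that
# the listener values match the stated 95/97, 8/9 and 1/3.
TYPICALITY = {
    "dalmatian": {"dalmatian": Fraction(95, 100), "cat": Fraction(1, 100), "bird": Fraction(1, 100)},
    "dog": {"dalmatian": Fraction(80, 100), "cat": Fraction(5, 100), "bird": Fraction(5, 100)},
    "animal": {"dalmatian": Fraction(50, 100), "cat": Fraction(50, 100), "bird": Fraction(50, 100)},
}


def nom_l0(noun, target):
    """L0(target | noun): typicality normalised over the three objects."""
    scores = TYPICALITY[noun]
    return scores[target] / sum(scores.values())


def nom_s1(target):
    """S1(. | target) with alpha = 1 and cost = 1: normalise L0 over nouns."""
    scores = {noun: nom_l0(noun, target) for noun in TYPICALITY}
    z = sum(scores.values())
    return {noun: s / z for noun, s in scores.items()}


for noun in TYPICALITY:
    print(noun, nom_l0(noun, "dalmatian"))  # 95/97, then 8/9, then 1/3

speaker = nom_s1("dalmatian")
print(speaker["dalmatian"] > speaker["dog"] > speaker["animal"])  # True
```

Only the meaning table differs from the modifier sketch; the listener and speaker steps are unchanged, which mirrors the claim that one mechanism covers both phenomena.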

noncomputable def DegenEtAl2020.nomBoolL0 (u : NomUtterance) :

Boolean nominal literal listener.

noncomputable def DegenEtAl2020.nomBoolS1 (w : NomWorld) :

Boolean nominal speaker.

The unified mechanism #

Capstone: continuous semantics makes both overmodification (Exp 1) and overspecification (Exp 3) rational, while the Boolean model predicts neither. One mechanism covers both phenomena; only the meaning function changes.