Linglib.Studies.DegenEtAl2020

[DHG+20]: When Redundancy Is Useful

[FG12] [DR95] [EBF06] [Gri75] [KD21] [WKM15]

Standard RSA with a Boolean semantics predicts no preference for overmodified referring expressions — if "small" already identifies the target, adding "blue" is literally uninformative. Yet speakers routinely overmodify, more with color than with size. [DHG+20] resolve this by relaxing the semantics to a continuous meaning φ(u, o) ∈ [0,1] (a Product-of-Experts over noisy feature channels): redundant modifiers then carry real information, and the color/size asymmetry follows from color channels being less noisy than size channels.

The model is the mathlib-PMF RSA pipeline (RSA.L0OfMeaning / RSA.S1Belief, [FG12]): the literal listener L0(·|u) : PMF World normalises φ, and the speaker S1(·|w) : PMF U is S1(u|w) ∝ L0(w|u)^α · cost(u). With α = 1 and no utterance cost (the multiplicative cost factor is the constant fun _ => 1), each prediction is one application of S1Belief_apply_lt_iff_score_lt: both sides share the same normalising constant, which cancels, leaving a comparison of L0 values in ℝ≥0∞.

Main results

csrsa_overmod_preferred — S1 prefers overmodified "small blue" over "small".
csrsa_sufficient_beats_redundant — "small" (sufficient) beats "blue" (redundant).
bool_no_overmod_preference — the Boolean model shows no overmod preference.
nominal_overspec_preferred / nom_bool_no_overspec — the Exp 3 noun analogue.
unified_continuous_semantics — both phenomena: cs-RSA yes, Boolean no.
noise_grounds_asymmetry, cost_zero_is_no_brevity — the structural bridges.

Verified data (prose, per [DHG+20])

Effect sizes are documented here, not encoded as Lean data.

Exp 1 (§3): main effect of sufficient property β = 3.54, SE = .22; scene-variation × property interaction β = 2.26, SE = .74. BDA-fitted noise (Figure 10): MAP x_color = .88, HDI [.85, .92]; x_size = .79, HDI [.76, .80]; near-zero costs β_c ≈ .02/.03, confirming color > size discrimination and the No-Brevity regime.

Exp 2 (§4.3): typicality β = −4.17, informativeness β = −5.56, color-competitor β = 0.71 (all p < .0001); [WKM15] found the same typicality direction (β = −2.36).

Exp 3 (§5.2): sub-necessary β = 2.11, basic-vs-super β = .60, typicality β = 4.82, length β = −.95, frequency β = .08 (NS); the BDA fits a substantial length cost (β_L = 2.69), so, unlike modifiers, nominal choice is not in the No-Brevity regime.

Modifier scene (Exp 1)

inductive DegenEtAl2020.World :

Type

Three pins varying in size and colour; the target is the small blue pin.

bigBlue : World
bigRed : World
smallBlue : World

@[implicit_reducible]

instance DegenEtAl2020.instDecidableEqWorld :

DecidableEq World

Equations

DegenEtAl2020.instDecidableEqWorld x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

@[implicit_reducible]

instance DegenEtAl2020.instReprWorld :

Repr World

Equations

DegenEtAl2020.instReprWorld = { reprPrec := DegenEtAl2020.instReprWorld.repr }

def DegenEtAl2020.instReprWorld.repr :

World → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.
DegenEtAl2020.instReprWorld.repr DegenEtAl2020.World.bigRed prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "DegenEtAl2020.World.bigRed")).group prec✝

@[implicit_reducible]

instance DegenEtAl2020.instInhabitedWorld :

Inhabited World

Equations

DegenEtAl2020.instInhabitedWorld = { default := DegenEtAl2020.instInhabitedWorld.default }

@[implicit_reducible]

instance DegenEtAl2020.instFintypeWorld :

Fintype World

Equations

DegenEtAl2020.instFintypeWorld = { elems := { val := ↑DegenEtAl2020.World.enumList, nodup := DegenEtAl2020.World.enumList_nodup }, complete := DegenEtAl2020.instFintypeWorld._proof_1 }

inductive DegenEtAl2020.Utterance :

Type

The seven referring expressions (each + implicit "pin"): four single adjectives and three size+colour pairs.

big : Utterance
small : Utterance
blue : Utterance
red : Utterance
bigBlue : Utterance
bigRed : Utterance
smallBlue : Utterance

@[implicit_reducible]

instance DegenEtAl2020.instDecidableEqUtterance :

DecidableEq Utterance

Equations

DegenEtAl2020.instDecidableEqUtterance x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def DegenEtAl2020.instReprUtterance.repr :

Utterance → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

@[implicit_reducible]

instance DegenEtAl2020.instReprUtterance :

Repr Utterance

Equations

DegenEtAl2020.instReprUtterance = { reprPrec := DegenEtAl2020.instReprUtterance.repr }

@[implicit_reducible]

instance DegenEtAl2020.instInhabitedUtterance :

Inhabited Utterance

Equations

DegenEtAl2020.instInhabitedUtterance = { default := DegenEtAl2020.instInhabitedUtterance.default }

@[implicit_reducible]

instance DegenEtAl2020.instFintypeUtterance :

Fintype Utterance

Equations

One or more equations did not get rendered due to their size.

@[reducible, inline]

abbrev DegenEtAl2020.target :

World

The target object.

Equations

DegenEtAl2020.target = DegenEtAl2020.World.smallBlue

noncomputable def DegenEtAl2020.φ :

Utterance → World → ENNReal

Continuous meaning φ(u, o) ∈ ℝ≥0∞ via a Product of Experts: a single adjective is its noise channel, a pair the product of its two channels.

Equations

DegenEtAl2020.φ DegenEtAl2020.Utterance.big DegenEtAl2020.World.bigBlue = ENNReal.ofReal DegenEtAl2020.sM✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.big DegenEtAl2020.World.bigRed = ENNReal.ofReal DegenEtAl2020.sM✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.big DegenEtAl2020.World.smallBlue = ENNReal.ofReal DegenEtAl2020.sm✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.small DegenEtAl2020.World.bigBlue = ENNReal.ofReal DegenEtAl2020.sm✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.small DegenEtAl2020.World.bigRed = ENNReal.ofReal DegenEtAl2020.sm✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.small DegenEtAl2020.World.smallBlue = ENNReal.ofReal DegenEtAl2020.sM✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.blue DegenEtAl2020.World.bigBlue = ENNReal.ofReal DegenEtAl2020.cM✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.blue DegenEtAl2020.World.bigRed = ENNReal.ofReal DegenEtAl2020.cm✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.blue DegenEtAl2020.World.smallBlue = ENNReal.ofReal DegenEtAl2020.cM✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.red DegenEtAl2020.World.bigBlue = ENNReal.ofReal DegenEtAl2020.cm✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.red DegenEtAl2020.World.bigRed = ENNReal.ofReal DegenEtAl2020.cM✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.red DegenEtAl2020.World.smallBlue = ENNReal.ofReal DegenEtAl2020.cm✝
DegenEtAl2020.φ DegenEtAl2020.Utterance.bigBlue DegenEtAl2020.World.bigBlue = ENNReal.ofReal (DegenEtAl2020.sM✝ * DegenEtAl2020.cM✝)
DegenEtAl2020.φ DegenEtAl2020.Utterance.bigBlue DegenEtAl2020.World.bigRed = ENNReal.ofReal (DegenEtAl2020.sM✝ * DegenEtAl2020.cm✝)
DegenEtAl2020.φ DegenEtAl2020.Utterance.bigBlue DegenEtAl2020.World.smallBlue = ENNReal.ofReal (DegenEtAl2020.sm✝ * DegenEtAl2020.cM✝)
DegenEtAl2020.φ DegenEtAl2020.Utterance.bigRed DegenEtAl2020.World.bigBlue = ENNReal.ofReal (DegenEtAl2020.sM✝ * DegenEtAl2020.cm✝)
DegenEtAl2020.φ DegenEtAl2020.Utterance.bigRed DegenEtAl2020.World.bigRed = ENNReal.ofReal (DegenEtAl2020.sM✝ * DegenEtAl2020.cM✝)
DegenEtAl2020.φ DegenEtAl2020.Utterance.bigRed DegenEtAl2020.World.smallBlue = ENNReal.ofReal (DegenEtAl2020.sm✝ * DegenEtAl2020.cm✝)
DegenEtAl2020.φ DegenEtAl2020.Utterance.smallBlue DegenEtAl2020.World.bigBlue = ENNReal.ofReal (DegenEtAl2020.sm✝ * DegenEtAl2020.cM✝)
DegenEtAl2020.φ DegenEtAl2020.Utterance.smallBlue DegenEtAl2020.World.bigRed = ENNReal.ofReal (DegenEtAl2020.sm✝ * DegenEtAl2020.cm✝)
DegenEtAl2020.φ DegenEtAl2020.Utterance.smallBlue DegenEtAl2020.World.smallBlue = ENNReal.ofReal (DegenEtAl2020.sM✝ * DegenEtAl2020.cM✝)
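
The Product-of-Experts pattern in these equations can be sketched in core Lean with natural-number weights. The channel values below (size 80 vs 20, colour 99 vs 1, in percent) are assumptions chosen to be consistent with the L0 values stated on this page; the rendered equations hide the actual constants. The names Obj, sizeSmall, colourBlue and phiSmallBlue are hypothetical and not part of the library.

```lean
-- Hypothetical stand-ins for the hidden channel constants, in percent.
-- Assumed values: size match 80, size mismatch 20,
-- colour match 99, colour mismatch 1.
inductive Obj where
  | bigBlue
  | bigRed
  | smallBlue

/-- Size channel for "small": high weight on the small pin. -/
def sizeSmall : Obj → Nat
  | .smallBlue => 80
  | _ => 20

/-- Colour channel for "blue": high weight on the blue pins. -/
def colourBlue : Obj → Nat
  | .bigRed => 1
  | _ => 99

/-- Product of Experts: "small blue" multiplies the two channels
(the result is in units of 1/10000). -/
def phiSmallBlue (o : Obj) : Nat := sizeSmall o * colourBlue o

-- One check per object, mirroring the three φ smallBlue equations.
example : phiSmallBlue .smallBlue = 80 * 99 := rfl  -- match × match
example : phiSmallBlue .bigBlue = 20 * 99 := rfl    -- size mismatch × colour match
example : phiSmallBlue .bigRed = 20 * 1 := rfl      -- mismatch × mismatch
```

Natural-number weights keep the sketch free of Mathlib; the library itself works in ENNReal via ENNReal.ofReal.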

noncomputable def DegenEtAl2020.L0 (u : Utterance) :

PMF World

Literal listener L0(·|u) : PMF World, normalising the continuous meaning.

Equations

DegenEtAl2020.L0 u = RSA.L0OfMeaning DegenEtAl2020.φ u ⋯ ⋯

theorem DegenEtAl2020.L0_small_target :

(L0 Utterance.small) target = ENNReal.ofReal (2 / 3)

L0(target | "small") = 2/3 — size is sufficient but noisy (not 1).

theorem DegenEtAl2020.L0_smallBlue_target :

(L0 Utterance.smallBlue) target = ENNReal.ofReal (99 / 124)

L0(target | "small blue") = 99/124 — the redundant colour sharpens via PoE.

theorem DegenEtAl2020.L0_blue_target :

(L0 Utterance.blue) target = ENNReal.ofReal (99 / 199)

L0(target | "blue") = 99/199 — colour is redundant (two objects are blue).
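
The three listener values can be reproduced by hand: L0(target | u) is the target's weight divided by the sum of weights over the three pins. The sketch below checks each fraction by cross-multiplication in core Lean. The percent weights (size 80 vs 20, colour 99 vs 1) are assumed; only the ratios 4:1 and 99:1 are implied by the stated results.

```lean
-- Assumed channel weights in percent (hypothetical; the page hides the constants):
-- size match 80, size mismatch 20, colour match 99, colour mismatch 1.
-- Each check has the form  den * targetWeight = num * (sum over the three pins),
-- which is L0(target | u) = num / den with the division cleared.

-- "small": weights 20 (bigBlue), 20 (bigRed), 80 (smallBlue); L0 = 2/3.
example : 3 * 80 = 2 * (20 + 20 + 80) := by decide

-- "blue": weights 99, 1, 99; L0 = 99/199.
example : 199 * 99 = 99 * (99 + 1 + 99) := by decide

-- "small blue": products 20*99, 20*1, 80*99; L0 = 99/124.
example : 124 * (80 * 99) = 99 * (20 * 99 + 20 * 1 + 80 * 99) := by decide
```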

noncomputable def DegenEtAl2020.S1 (w : World) :

PMF Utterance

Pragmatic speaker S1(·|w) ∝ L0(w|u) (α = 1, zero cost), a PMF Utterance.

Equations

DegenEtAl2020.S1 w = RSA.S1Belief DegenEtAl2020.L0 (fun (x : DegenEtAl2020.Utterance) => 1) 1 w ⋯ ⋯

theorem DegenEtAl2020.csrsa_overmod_preferred :

(S1 target) Utterance.small < (S1 target) Utterance.smallBlue

Main result: S1 strictly prefers the overmodified "small blue" over the sufficient "small" — overmodification is rational under noisy perception.

theorem DegenEtAl2020.csrsa_sufficient_beats_redundant :

(S1 target) Utterance.blue < (S1 target) Utterance.small

The sufficient "small" beats the redundant "blue" (the size principle).
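
Because S1(u | target) is proportional to L0(target | u) when α = 1 and the cost factor is constant, both speaker orderings reduce to comparing the listener values stated earlier on this page (2/3, 99/124, 99/199). A minimal arithmetic sketch in core Lean, independent of the library's actual proofs:

```lean
-- a/b < c/d is checked as a * d < c * b (all denominators are positive).

-- csrsa_overmod_preferred: L0("small") = 2/3 < 99/124 = L0("small blue").
example : 2 * 124 < 99 * 3 := by decide

-- csrsa_sufficient_beats_redundant: L0("blue") = 99/199 < 2/3 = L0("small").
example : 99 * 3 < 2 * 199 := by decide
```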

φ grounded in the noise channels

theorem DegenEtAl2020.φ_grounded_in_noise :

φ Utterance.blue World.smallBlue = RSA.Noise.colorMatch_e ∧ φ Utterance.blue World.bigRed = RSA.Noise.colorMismatch_e ∧ φ Utterance.small World.smallBlue = RSA.Noise.sizeMatch_e ∧ φ Utterance.small World.bigBlue = RSA.Noise.sizeMismatch_e

φ uses the RSA.Noise channel parameters by construction.

theorem DegenEtAl2020.noise_grounds_asymmetry :

φ Utterance.blue World.smallBlue = RSA.Noise.colorMatch_e ∧ φ Utterance.small World.smallBlue = RSA.Noise.sizeMatch_e ∧ RSA.Noise.colorDiscrimination > RSA.Noise.sizeDiscrimination

The colour > size discrimination ordering grounds the modifier asymmetry.

Boolean baseline

def DegenEtAl2020.φbool :

Utterance → World → Bool

Boolean (zero-noise) meaning: a feature matches (true) or not.

noncomputable def DegenEtAl2020.boolL0 (u : Utterance) :

PMF World

Boolean literal listener: uniform on the extension.

Equations

DegenEtAl2020.boolL0 u = RSA.L0OfBoolMeaning DegenEtAl2020.φbool u ⋯

theorem DegenEtAl2020.boolL0_small_target :

(boolL0 Utterance.small) target = 1

theorem DegenEtAl2020.boolL0_smallBlue_target :

(boolL0 Utterance.smallBlue) target = 1

noncomputable def DegenEtAl2020.boolS1 (w : World) :

PMF Utterance

Boolean pragmatic speaker.

Equations

DegenEtAl2020.boolS1 w = RSA.S1Belief DegenEtAl2020.boolL0 (fun (x : DegenEtAl2020.Utterance) => 1) 1 w ⋯ ⋯

theorem DegenEtAl2020.bool_no_overmod_preference :

¬(boolS1 target) Utterance.small < (boolS1 target) Utterance.smallBlue

The Boolean model shows no overmodification preference: "small" already identifies the target perfectly, so adding "blue" adds nothing.
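
The Boolean tie can be seen by counting extensions. The sketch below is core Lean with hypothetical names (Obj, isSmall, isBlue, extSize); the crisp meanings are assumed, since the equations for φbool are not rendered on this page.

```lean
-- Hypothetical crisp scene; names and definitions are illustrative only.
inductive Obj where
  | bigBlue
  | bigRed
  | smallBlue

/-- Crisp meaning of "small". -/
def isSmall : Obj → Bool
  | .smallBlue => true
  | _ => false

/-- Crisp meaning of "blue". -/
def isBlue : Obj → Bool
  | .bigRed => false
  | _ => true

/-- Number of objects an utterance is true of (the size of its extension). -/
def extSize (m : Obj → Bool) : Nat :=
  ([Obj.bigBlue, Obj.bigRed, Obj.smallBlue].filter m).length

-- The target satisfies "small", and it is the only pin that does.
example : isSmall .smallBlue = true := rfl
example : extSize isSmall = 1 := rfl

-- "small blue" (the conjunction) has the same one-element extension.
example : extSize (fun o => isSmall o && isBlue o) = 1 := rfl

-- A uniform listener therefore gives the target probability 1/1 in both
-- cases, and equal scores cannot be strictly ordered.
example : ¬ ((1 : Nat) < 1) := by decide
```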

No-Brevity bridge

theorem DegenEtAl2020.cost_zero_is_no_brevity :

DaleReiter1995.BrevityInterpretation.noBrevity.strength = 0 ∧ Pragmatics.GriceanMaxims.QuantityViolation.underInformative.submaxim ≠ Pragmatics.GriceanMaxims.QuantityViolation.overInformative.submaxim

cs-RSA operates in [DR95]'s No-Brevity regime (zero cost, fitted β_c ≈ 0), and Q1/Q2 are independent sub-maxims — so over-description is Q1 (informativity under noise), not a Q2 violation.

Nominal scene (Exp 3): overspecification via typicality

The same mechanism with a typicality meaning for nouns: a graded φ_typ ∈ [0,1] plays the role noise plays for adjectives. Values are illustrative (the paper uses elicited typicality norms): the dalmatian is a very typical dalmatian, a typical dog, and a moderately typical animal.

inductive DegenEtAl2020.NomWorld :

Type

The target is a dalmatian among a cat and a bird; the basic-level "dog" is sufficient to identify it.

dalmatian : NomWorld
cat : NomWorld
bird : NomWorld

@[implicit_reducible]

instance DegenEtAl2020.instDecidableEqNomWorld :

DecidableEq NomWorld

Equations

DegenEtAl2020.instDecidableEqNomWorld x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def DegenEtAl2020.instReprNomWorld.repr :

NomWorld → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

@[implicit_reducible]

instance DegenEtAl2020.instReprNomWorld :

Repr NomWorld

Equations

DegenEtAl2020.instReprNomWorld = { reprPrec := DegenEtAl2020.instReprNomWorld.repr }

@[implicit_reducible]

instance DegenEtAl2020.instInhabitedNomWorld :

Inhabited NomWorld

Equations

DegenEtAl2020.instInhabitedNomWorld = { default := DegenEtAl2020.instInhabitedNomWorld.default }

@[implicit_reducible]

instance DegenEtAl2020.instFintypeNomWorld :

Fintype NomWorld

Equations

One or more equations did not get rendered due to their size.

inductive DegenEtAl2020.NomUtterance :

Type

Noun utterances at three taxonomic levels.

sub : NomUtterance
basic : NomUtterance
super : NomUtterance

@[implicit_reducible]

instance DegenEtAl2020.instDecidableEqNomUtterance :

DecidableEq NomUtterance

Equations

DegenEtAl2020.instDecidableEqNomUtterance x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

@[implicit_reducible]

instance DegenEtAl2020.instReprNomUtterance :

Repr NomUtterance

Equations

DegenEtAl2020.instReprNomUtterance = { reprPrec := DegenEtAl2020.instReprNomUtterance.repr }

def DegenEtAl2020.instReprNomUtterance.repr :

NomUtterance → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

@[implicit_reducible]

instance DegenEtAl2020.instInhabitedNomUtterance :

Inhabited NomUtterance

Equations

DegenEtAl2020.instInhabitedNomUtterance = { default := DegenEtAl2020.instInhabitedNomUtterance.default }

@[implicit_reducible]

instance DegenEtAl2020.instFintypeNomUtterance :

Fintype NomUtterance

Equations

One or more equations did not get rendered due to their size.

noncomputable def DegenEtAl2020.φtyp :

NomUtterance → NomWorld → ENNReal

Typicality meaning φ_typ(u, o) ∈ ℝ≥0∞.

Equations

DegenEtAl2020.φtyp DegenEtAl2020.NomUtterance.sub DegenEtAl2020.NomWorld.dalmatian = ENNReal.ofReal (19 / 20)
DegenEtAl2020.φtyp DegenEtAl2020.NomUtterance.sub DegenEtAl2020.NomWorld.cat = ENNReal.ofReal (1 / 100)
DegenEtAl2020.φtyp DegenEtAl2020.NomUtterance.sub DegenEtAl2020.NomWorld.bird = ENNReal.ofReal (1 / 100)
DegenEtAl2020.φtyp DegenEtAl2020.NomUtterance.basic DegenEtAl2020.NomWorld.dalmatian = ENNReal.ofReal (4 / 5)
DegenEtAl2020.φtyp DegenEtAl2020.NomUtterance.basic DegenEtAl2020.NomWorld.cat = ENNReal.ofReal (1 / 20)
DegenEtAl2020.φtyp DegenEtAl2020.NomUtterance.basic DegenEtAl2020.NomWorld.bird = ENNReal.ofReal (1 / 20)
DegenEtAl2020.φtyp DegenEtAl2020.NomUtterance.super DegenEtAl2020.NomWorld.dalmatian = ENNReal.ofReal (7 / 10)
DegenEtAl2020.φtyp DegenEtAl2020.NomUtterance.super DegenEtAl2020.NomWorld.cat = ENNReal.ofReal (7 / 10)
DegenEtAl2020.φtyp DegenEtAl2020.NomUtterance.super DegenEtAl2020.NomWorld.bird = ENNReal.ofReal (7 / 10)

noncomputable def DegenEtAl2020.nomL0 (u : NomUtterance) :

PMF NomWorld

Nominal literal listener.

Equations

DegenEtAl2020.nomL0 u = RSA.L0OfMeaning DegenEtAl2020.φtyp u ⋯ ⋯

theorem DegenEtAl2020.nomL0_sub :

(nomL0 NomUtterance.sub) NomWorld.dalmatian = ENNReal.ofReal (95 / 97)

L0(dalmatian | "dalmatian") = 95/97 — near-perfect via typicality.

theorem DegenEtAl2020.nomL0_basic :

(nomL0 NomUtterance.basic) NomWorld.dalmatian = ENNReal.ofReal (8 / 9)

L0(dalmatian | "dog") = 8/9 — the basic term discriminates well.

theorem DegenEtAl2020.nomL0_super :

(nomL0 NomUtterance.super) NomWorld.dalmatian = ENNReal.ofReal (1 / 3)

L0(dalmatian | "animal") = 1/3 — no discrimination.
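
These three values follow directly from the φtyp equations: divide the dalmatian's typicality by the sum over dalmatian, cat and bird. A small core-Lean check, with the typicality values rewritten in hundredths (19/20 = 95, 1/100 = 1, 4/5 = 80, 1/20 = 5, 7/10 = 70) and each fraction verified by cross-multiplication:

```lean
-- Each check has the form  den * dalmatianWeight = num * (sum of the three weights).

-- "dalmatian" (sub): weights 95, 1, 1; L0 = 95/97.
example : 97 * 95 = 95 * (95 + 1 + 1) := by decide

-- "dog" (basic): weights 80, 5, 5; L0 = 8/9.
example : 9 * 80 = 8 * (80 + 5 + 5) := by decide

-- "animal" (super): weights 70, 70, 70; L0 = 1/3.
example : 3 * 70 = 1 * (70 + 70 + 70) := by decide
```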

noncomputable def DegenEtAl2020.nomS1 (w : NomWorld) :

PMF NomUtterance

Nominal pragmatic speaker.

Equations

DegenEtAl2020.nomS1 w = RSA.S1Belief DegenEtAl2020.nomL0 (fun (x : DegenEtAl2020.NomUtterance) => 1) 1 w ⋯ ⋯

theorem DegenEtAl2020.nominal_overspec_preferred :

(nomS1 NomWorld.dalmatian) NomUtterance.basic < (nomS1 NomWorld.dalmatian) NomUtterance.sub

Nominal overspecification: S1 prefers the subordinate "dalmatian" over the sufficient basic "dog" — the noun analogue of csrsa_overmod_preferred.

theorem DegenEtAl2020.nominal_basic_beats_super :

(nomS1 NomWorld.dalmatian) NomUtterance.super < (nomS1 NomWorld.dalmatian) NomUtterance.basic

The basic "dog" beats the superordinate "animal".
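
As in the modifier scene, nomS1 with α = 1 and a constant cost factor is proportional to nomL0, so both nominal orderings reduce to comparing the listener values 1/3, 8/9 and 95/97 stated on this page. A minimal arithmetic sketch in core Lean, not the library's own proof:

```lean
-- a/b < c/d is checked as a * d < c * b (all denominators are positive).

-- nominal_overspec_preferred: L0("dog") = 8/9 < 95/97 = L0("dalmatian").
example : 8 * 97 < 95 * 9 := by decide

-- nominal_basic_beats_super: L0("animal") = 1/3 < 8/9 = L0("dog").
example : 1 * 9 < 8 * 3 := by decide
```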

def DegenEtAl2020.φtypBool :

NomUtterance → NomWorld → Bool

Boolean (crisp) typicality.

noncomputable def DegenEtAl2020.nomBoolL0 (u : NomUtterance) :

PMF NomWorld

Boolean nominal literal listener.

Equations

DegenEtAl2020.nomBoolL0 u = RSA.L0OfBoolMeaning DegenEtAl2020.φtypBool u ⋯

theorem DegenEtAl2020.nomBoolL0_sub :

(nomBoolL0 NomUtterance.sub) NomWorld.dalmatian = 1

theorem DegenEtAl2020.nomBoolL0_basic :

(nomBoolL0 NomUtterance.basic) NomWorld.dalmatian = 1

noncomputable def DegenEtAl2020.nomBoolS1 (w : NomWorld) :

PMF NomUtterance

Boolean nominal speaker.

Equations

DegenEtAl2020.nomBoolS1 w = RSA.S1Belief DegenEtAl2020.nomBoolL0 (fun (x : DegenEtAl2020.NomUtterance) => 1) 1 w ⋯ ⋯

theorem DegenEtAl2020.nom_bool_no_overspec :

¬(nomBoolS1 NomWorld.dalmatian) NomUtterance.basic < (nomBoolS1 NomWorld.dalmatian) NomUtterance.sub

The Boolean model shows no overspecification preference.

The unified mechanism

Capstone: continuous semantics makes both overmodification (Exp 1) and overspecification (Exp 3) rational, while the Boolean model predicts neither — one mechanism, two phenomena, only the meaning function changes.