[CGGP19] — Incremental Iterated Response Model #
Cohn-Gordon, R., Goodman, N. D., & Potts, C. (2019). An Incremental Iterated Response Model of Pragmatics. Proceedings of the Society for Computation in Linguistics (SCiL) 2, 81–90.
The Model #
The incremental RSA model extends the standard RSA framework to word-by-word production. The speaker produces referring expressions incrementally, choosing each word to maximize the listener's posterior probability for the target:
S1^WORD(wₖ | [w₁,...,wₖ₋₁], r) ∝ L0(r | w₁,...,wₖ)^α
The full utterance probability factors via the chain rule:
S1^UTT-IP(w₁,...,wₙ | r) = ∏ₖ S1^WORD(wₖ | [w₁,...,wₖ₋₁], r)
L0 uses extension-based incremental semantics (§2.2): given prefix c,
⟦c⟧(w) = |{u ∈ U : c ⊑ u ∧ ⟦u⟧(w) = 1}| / |{u ∈ U : c ⊑ u ∧ ∃w'. ⟦u⟧(w') = 1}|
where U is the set of complete utterances and ⊑ is the prefix relation.
Formalization via the IncrementalSemantics bundle #
Each scene in this file is a single value of RSA.IncrementalSemantics U W
(in Pragmatics/RSA/Incremental.lean), specifying just the lexicon
(wordApplies), the closed set of complete utterances, and the world set.
The file derives the chain-rule speaker (α = 1, no cost, uniform priors)
and extension-based L0 from the bundle, so the three scenes (Figure 1, the
[Sed07] reference game, the [rubio-fernandez-2016] display) share
machinery rather than duplicating it.
The bundle exposes a single deep theorem, l0Utt_ge_inv_card, proving
the §2.4 weakly-informative bound generically: any complete utterance true
of r ∈ worlds yields a literal posterior at least 1 / worlds.length.
The Figure 1 application (greedyUnroll_weakly_informative) below
discharges only the r ∈ worlds and uttSem utt r = true premises;
the bound follows.
Main results #
adj_first_for_target/noun_after_adj/noun_only_for_r2/adj_only_for_r3: the Figure 1c word-by-word speaker rows.uniform_after_red_for_r2: the §2.2 dead-end fallback (Figure 1c's R2 row: 0, ½, ½ over words with viable continuations).listener_anticipation: hearing "red", L1 favours R3 at 7/11 (Figure 1d) — the anticipatory implicature.incremental_prefers_bare_noun: the chain-rule product prefers bare "dress" (3/7) over "red dress" (8/21) for R1 (Figure 1e), diverging fromglobal_prefers_red_dress— the paper's central architectural wedge.sedivy_contrast_inference/cgSedivyLooks_satisfy_contrast_reduces_competitor: the §3.2 contrastive-inference reanalysis, discharging the visual-world paradigm contract under the Bayesian posterior linking hypothesis.
Implementation notes #
Extension counts are natural numbers, so the chain is exact ℚ≥0 with PMF
agents via PMF.ofScores. The §2.2 dead-end fallback distributes over
words with viable continuations, gated to scene referents (out-of-scene
referents keep their zero row). Utterance-level probabilities are
chain-rule products of s1 values (eq. 7).
Domain Types (Figure 1a) #
Equations
- CohnGordonEtAl2019.instDecidableEqWord x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- CohnGordonEtAl2019.instReprWord = { reprPrec := CohnGordonEtAl2019.instReprWord.repr }
Equations
- CohnGordonEtAl2019.instDecidableEqReferent x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- CohnGordonEtAl2019.instReprReferent = { reprPrec := CohnGordonEtAl2019.instReprReferent.repr }
The Figure 1 bundle #
The Figure 1 reference scene as an IncrementalSemantics bundle:
three words ("red", "dress", "object"), three complete utterances
("dress", "red dress", "red object"), three referents (R1, R2, R3).
Equations
- One or more equations did not get rendered due to their size.
Instances For
The kernel face over the bundle #
ℚ extension semantics ⟦pfx⟧(r) (§2.2), mirroring
IncrementalSemantics.incrementalSem.
Equations
- CohnGordonEtAl2019.incSemScore sem pfx r = ↑(sem.trueExtCount pfx r) / ↑(sem.viableExtCount pfx)
Instances For
Word-level literal listener value (eq. 4).
Equations
- CohnGordonEtAl2019.l0Score sem ctx u r = CohnGordonEtAl2019.incSemScore sem (ctx ++ [u]) r / ∑ r' : W, CohnGordonEtAl2019.incSemScore sem (ctx ++ [u]) r'
Instances For
Word-level speaker score at context ctx (eq. 5, zero cost, α = 1).
At dead-end cells — no word makes the referent reachable — probability is
"evenly distributed" over the words with viable continuations (§2.2,
Figure 1c's R2 row: 0, ½, ½).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Normalized incremental speaker.
Equations
- CohnGordonEtAl2019.s1Post sem ctx r = PMF.normalizeScores (CohnGordonEtAl2019.s1Score sem ctx r)
Instances For
Word-by-word speaker at context ctx (eq. 5).
Equations
- CohnGordonEtAl2019.s1 sem ctx r = PMF.ofScores PMF.Fallback.uniform (CohnGordonEtAl2019.s1Score sem ctx r)
Instances For
Pragmatic listener upon the first word (eq. 6, uniform priors).
Equations
- CohnGordonEtAl2019.l1 sem u = PMF.ofScores PMF.Fallback.uniform fun (r : W) => CohnGordonEtAl2019.s1Post sem [] r u
Instances For
The incremental chain #
Predictions #
Figure 1c: the word-by-word speaker #
For R1 the speaker leads with "red" (4/7 > 3/7): both red referents stay viable, while "dress" dilutes L0 over the two dresses.
After "red", the speaker completes with "dress" (2/3 > 1/3): "red dress" is unique to R1, "red object" ambiguous with R3.
For R2 (blue dress) the speaker must start with "dress": no extension of "red" is true of R2.
For R3 (red hat) the speaker must start with "red": "dress" is false of R3.
The §2.2 dead end: after "red" for R2 nothing is true, and "probability is evenly distributed over all choices of word" — S1 is uniform (½, ½) over the viable continuations.
Figure 1d: the pragmatic listener #
The anticipatory implicature: hearing "red", L1 favours R3 over R1 (7/11 > 4/11) — "red" is R3's only option, while R1's speaker had alternatives. The §3.2 [STCC99] bridge below builds on this; the authors cite [Sed07] for the effect.
Figure 1e: utterance-level probabilities #
The architectural wedge (Figure 1e): the chain-rule product prefers bare "dress" (3/7) over "red dress" (4/7 · 2/3 = 8/21) for R1, while the global model prefers the more informative "red dress".
§2.4 Weakly-Informative Greedy Unrolling #
§2.4's weakly informative bound — greedy unrolling reaches a complete
utterance whose literal posterior for the target is at least 1/|W| — is
proved generically as l0Utt_ge_inv_card in Incremental.lean; here we
define the Figure 1 unroller and discharge its premises.
Greedy unrolling for Figure 1's scene: at each step pick the word maximizing L0(r | ctx ++ [w]); stop when ctx is a complete utterance. Tabulated by case for the three Figure 1 referents.
Equations
- CohnGordonEtAl2019.greedyUnroll CohnGordonEtAl2019.Referent.redDress = [CohnGordonEtAl2019.Word.red, CohnGordonEtAl2019.Word.dress]
- CohnGordonEtAl2019.greedyUnroll CohnGordonEtAl2019.Referent.blueDress = [CohnGordonEtAl2019.Word.dress]
- CohnGordonEtAl2019.greedyUnroll CohnGordonEtAl2019.Referent.redHat = [CohnGordonEtAl2019.Word.red, CohnGordonEtAl2019.Word.object]
Instances For
§2.4 weakly informative bound, instantiated for Figure 1.
Each of the three greedy outputs is a complete utterance true of its
target referent, so the generic l0Utt_ge_inv_card theorem from
Incremental.lean immediately gives the 1/|worlds| = 1/3 bound. The
actual values for this scene are 1, 1/2, 1/2 — the bound is loose here
by design: it certifies architectural sanity, not optimality.
Global RSA Model + Divergence (§2.4) #
The global RSA model treats each complete utterance as an atomic
option, normalizing over the whole utterance space rather than chaining
word-by-word. The divergence between global and incremental predictions
for R1 is a central result of [CGGP19] §2.4:
the global model prefers the more-informative "red dress" over the bare
"dress" (standard RSA Q-implicature), but the incremental model prefers
"dress" because chain-rule products penalize longer trajectories
(Finding 7, incremental_prefers_bare_noun).
Equations
- CohnGordonEtAl2019.instDecidableEqUtterance x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Boolean truth of a complete utterance for a referent.
Equations
- CohnGordonEtAl2019.uttApplies CohnGordonEtAl2019.Utterance.dress CohnGordonEtAl2019.Referent.redDress = true
- CohnGordonEtAl2019.uttApplies CohnGordonEtAl2019.Utterance.dress CohnGordonEtAl2019.Referent.blueDress = true
- CohnGordonEtAl2019.uttApplies CohnGordonEtAl2019.Utterance.redDress CohnGordonEtAl2019.Referent.redDress = true
- CohnGordonEtAl2019.uttApplies CohnGordonEtAl2019.Utterance.redObject CohnGordonEtAl2019.Referent.redDress = true
- CohnGordonEtAl2019.uttApplies CohnGordonEtAl2019.Utterance.redObject CohnGordonEtAl2019.Referent.redHat = true
- CohnGordonEtAl2019.uttApplies x✝¹ x✝ = false
Instances For
Global literal listener value (eq. 1): indicator over referents.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Global speaker (eq. 2 at zero cost, α = 1): renormalizes L0 over the three atomic utterances.
Equations
Instances For
Divergence from incremental (§2.4): the global RSA prefers the
fully-modified "red dress" over the bare "dress" for R1, because
"red dress" uniquely identifies R1 while "dress" leaves R1/R2
ambiguous. Compare Finding 7 (incremental_prefers_bare_noun),
where the incremental trajectory probability has the opposite
preference: chain-rule products discount longer trajectories enough
to flip the ordering. This is the central empirical wedge between
the two architectures the paper articulates.
The §3.2 contrastive-inference bridge #
§3.2 reanalyses [Sed07]'s review (the effect is
[STCC99]'s) in incremental RSA: hearing bare "tall", L1 prefers
the tall cup over the tall pitcher — extensionally both fit, but a
pitcher-referring speaker had no use for "tall". Both contrast cells are
formalized: the paper's four-referent sedivyBundle and a no-contrast
companion (SedivyScene_NoContrast.bundle, dropping the absent short
cup's word); the shared Referent type lets one Cell-typed
LookProportion read off both.
Equations
- CohnGordonEtAl2019.SedivyScene.instDecidableEqWord x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Equations
- CohnGordonEtAl2019.SedivyScene.instDecidableEqReferent x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
The Sedivy scene as an IncrementalSemantics bundle: 5 words,
6 complete utterances (3 adj+noun phrases + 3 bare nouns; the
bare-noun option is essential — without it "tall" is no longer
diagnostic of the cup), 4 referents.
Equations
- One or more equations did not get rendered due to their size.
Instances For
No-contrast companion scene #
No-contrast variant of the Sedivy scene, sharing
SedivyScene.Referent but with a smaller word inventory. Empirically
this is the no-contrast cell of [STCC99]'s 2 × 2 × 2
design: the same-category contrast object (the short cup) is removed
from the visual display.
The companion bundle drops .short from Word and the [short, cup]
utterance from completeUtterances. Justification: with no shortCup
in the display, a cooperative speaker has no scene-anchored use for
.short, and the paper's IncrementalSemantics is a scene-specific
production model rather than a lexicon-wide one. (The listener's
standing mental lexicon still contains short; the bundle here is a
model of speaker production for this scene, not of mental
inventories.)
Why the fresh Word type rather than a {sedivyBundle with worlds := …}
update? Keeping .short in the lexicon while removing all of its
referents leaves incrementalSem [.short] _ = 0/0, which mathlib treats
as 0 but which kernel evaluation of the ℚ face cannot reduce.
The fresh type sidesteps the divide-by-zero pattern at the cost of mild
bundle duplication.
Equations
- CohnGordonEtAl2019.SedivyScene_NoContrast.instDecidableEqWord x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Cohn-Gordon §3.2 prediction: after hearing "tall", L1 favours the tall cup over the tall pitcher (3/5 vs 2/5). The mechanism is the contrastive inference: a speaker referring to the pitcher would use "pitcher" alone (S1(pitcher | tallPitcher) = 2/3); the only referent for which "tall" is the speaker's preferred first word is the tall cup, where "cup" alone leaves shortCup ambiguous.
This formalises [Sed07]'s anticipatory contrast effect
within the incremental RSA framework (and indirectly captures the
[STCC99] empirical pattern Sedivy 2007 reviews).
The paradigm-level statement (Sedivy Pattern 2,
VisualWorld.ContrastReducesCompetitorLooks) requires a
contrast vs no-contrast comparison; this theorem captures the
contrast-condition direction only.
Cell-typed look projection for the Sedivy paradigm under the incremental-RSA model.
Linking hypothesis (load-bearing, editorial) #
The incremental RSA model produces a posterior over referents,
L1 : Word → Referent → ℝ. Visual-world data are fixation
proportions. Mapping the former to the latter requires a linking
hypothesis. This file makes the simplest one explicit:
Bayesian posterior linking hypothesis — the proportion of looks to an object equals the listener's posterior probability of that object being the referent at the same point in the unfolding utterance.
[CGGP19] do not state this assumption;
they discuss the contrastive-inference effect at the level of L1
posteriors and treat empirical contact with [Sed07]'s
look data informally. The Bayesian linking hypothesis used here is
the strongest natural choice given a single normalised posterior.
A weaker alternative would be a Luce-choice rule over a
monotone-in-posterior activation; that weakening preserves the
qualitative inequality patterns this file proves. If a second linking
hypothesis enters the codebase, the paradigm contract should grow a
typed LinkingHypothesis API and
the bridge theorem statement should mention which hypothesis is
in force.
Construction #
cgSedivyLooks role c selects the appropriate scene config based on
c.contrast (incRSA_sedivy for the contrast cell;
SedivyScene_NoContrast.incRSA_sedivy_noContrast for the no-contrast
cell) and reads off L1 .tall · at the referent corresponding to
role. Other factors of the cell (typicality, task) are ignored —
the incremental RSA model has no internal representation of
typicality or task, so the projection is constant in those factors.
Cells in the contrast condition cover four roles; cells in the
no-contrast condition omit .contrastingObject (no shortCup is on
display) and so collapse to 0 for that role.
Equations
- One or more equations did not get rendered due to their size.
- CohnGordonEtAl2019.cgSedivyLooks role✝ c = 0
Instances For
Paradigm Pattern 2 verified for Cohn-Gordon's incremental RSA:
swapping the contrast factor from contrast to noContrast
strictly increases looks to the cross-category competitor (the tall
pitcher), under the Bayesian posterior linking hypothesis stated on
cgSedivyLooks.
Mechanism: in the contrast scene, L1(.tall, tallPitcher) = 2/5
because a speaker referring to the pitcher would prefer "pitcher"
alone (the shortCup distractor pulls "tall" toward the cup). In
the no-contrast scene there is no shortCup, "tall" is uninformative
between the two extant scale-pole objects, and L1(.tall, tallPitcher) = 1/2.
Discharges
SedivyEtAl1999.SatisfiesSedivyPattern.contrast_reduces_competitor_looks
for this model. The proof reduces — via the HasContrastCondition
lens applied to a destructured cell — to the per-cell L1 inequality,
kernel-verified on the exact-ℚ face.
Rubio-Fernández §3.1 Bridge (English Over-Modification, STOP token) #
[CGGP19] §3.1 reanalyses
[rubio-fernandez-2016]'s finding that English speakers
over-modify (saying "the red dress" when "the dress" suffices in a
display with one dress). The mechanism: an explicit STOP token
marks the end of the utterance, so trajectories of different lengths
([dress, STOP] vs [red, dress, STOP]) become directly comparable
under the chain rule. Without STOP, the chain rule penalizes longer
trajectories monotonically (Finding 7); with STOP and a per-word
cost the over-modification preference can emerge in the right cost
regime.
This formalisation establishes the model's lexicon and complete-
utterance set with STOP, and proves the structural invariants
(every complete utterance ends in STOP; STOP does not apply
mid-utterance). The cost-dependent comparison theorem
S1^UTT-IP(red dress, STOP | R1) > S1^UTT-IP(dress, STOP | R1) is
left as future work — formalising it requires Real.exp over a cost
schedule and a quantitative argument that does not reduce via
kernel comparison.
Equations
- CohnGordonEtAl2019.RubioFernandezScene.instDecidableEqWord x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Display referents: a red dress and a blue hat, the canonical minimal pair from [rubio-fernandez-2016]'s display.
Instances For
Equations
- CohnGordonEtAl2019.RubioFernandezScene.instDecidableEqReferent x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Rubio-Fernández scene as an IncrementalSemantics bundle:
five words including stop, four complete utterances all ending
in stop, two referents. stop does not apply to any referent —
it is a structural marker, not a content word.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Every complete utterance terminates with stop. This is the
structural invariant the STOP machinery enforces.
stop never applies to any referent — it is a structural marker,
not a content word. This is what makes a STOP-augmented utterance
u ++ [.stop] veridically equivalent to the underlying content
sequence u.