Erk & Herbelot 2024 — How to Marry a Star #
@cite{erk-herbelot-2024}
Erk, K. & Herbelot, A. (2024). How to Marry a Star: Probabilistic Constraints for Meaning in Context. Journal of Semantics 40(4), 549–583.
Status: Phase 3 (paper-faithful instantiation) #
This file instantiates the SDS substrate from
Theories/Semantics/Probabilistic/SDS/{GraphicalModel,JointPosterior}.lean
on the paper's running examples — currently the bat-in-player sentence
(paper §5.1, Figure 5, Table 1). Phase 4 will add the astronomer-married-
star sentence (Table 2).
The previous version of this file used the legacy SDSConstraintSystem
flat-substrate (a Product-of-Experts caricature that collapsed the
paper's directed graphical model to two functions over the concept
space). Replaced because the caricature could not reproduce Tables 1–2
nor the qualitative α-monotonicity result paper §5.2 advertises. The
new version uses the paper-faithful multi-node graphical model.
Numerical reproduction #
Closed-form derivation in our framework (see derivations in theorem docstrings) gives:
| α | P(BAT-STICK | obs) | P(BAT-ANIMAL | obs) |
|---|---|---|
| 0.5 | 3/4 = 0.75 | 1/4 = 0.25 |
| 0.1 | 11/12 ≈ 0.917 | 1/12 ≈ 0.083 |
Paper Table 1 (p. 571, WebPPL Monte Carlo, 2000 samples):
| α | p(stick) | p(animal) |
|---|---|---|
| 0.5 | 0.82 | 0.18 |
| 0.1 | 0.96 | 0.04 |
After detailed re-read of paper §4.1, §4.2, and §5.1 graphical-model descriptions (PDF pp. 13-25), the ~7pp discrepancy at α=0.5 is NOT explained by:
- HOLD-AGENT selectional choice: PLAYER is observed at the player
node, so
selectional(hold-agent, PLAYER)is a constant factor in all configurations and cancels in normalization. Any non-zero spec — uniform or otherwise — gives the same posterior at the bat node, given the observation. - Bernoulli role-existence nodes (paper §4.1, p. 563): these are typically = 1 for mandatory roles (paper: "the sleeper always needs to be realized in a sleeping event, that is, P(SLEEP-THEME | SLEEP) = 1"). HOLD-AGENT and HOLD-THEME for "hold" are both presumably mandatory; even if not, observing Agent/Theme conditions pins the Bernoulli to "yes" with constant likelihood factor across configs.
- Other-verb sem.role nodes (paper §4.1: "PAINT-AGENT and PAINT-THEME, both with zero probability of occurring as roles of SLEEP"): contribute multiplicative factor 1 to all configurations, wash out.
- Verb concept-node lacking a role contribution (paper p. 569:
verb concept (5) "is conditionally dependent on node (3)" only,
not on any role node): my model places a uniform-PMF placeholder
verb_selfat the verb node. Since c_verb = HOLD is observed, the uniform-PMF contribution is a constant factor across configs. - Soft vs hard role-Bernoullis with non-unit P(role | verb): doesn't change posterior given observation pins the Bernoulli.
The most plausible remaining explanations:
- Monte Carlo noise in paper's 2000-sample WebPPL simulation. SE for p=0.82 with N=2000 is ≈ 0.009, so the 95% CI on paper's true probability is roughly [0.80, 0.84]. Our 0.75 is 8 SDs below 0.82 — within MC noise only if paper's underlying probability is much closer to 0.78 than 0.82.
- A graphical-model structural element we haven't identified, such as additional dependencies between concept nodes and the scenario-mix node that aren't visible in the figures we read.
- Paper's WebPPL implementation specifics (rejection-sampling bias, etc.) that effectively change the implied posterior.
The qualitative direction — lower α → more BAT-STICK (the BASEBALL-favored sense) — matches the paper. Both rows of Table 1 show the same direction in our closed-form derivation.
The numbers above are what the closed-form joint posterior of our graphical model evaluates to; we do not back-solve parameters to match the paper's WebPPL output (per the user-locked decision in the 0.230.298 redo: "compute the true closed-form joint posterior — don't back-solve, don't intervals").
Provenance for paper-cited values #
- 9-concept inventory: paper p. 569 ("BALL, BAT-ANIMAL, HOLD, BAT-STICK, CANDLE, CAT, PLAYER, STONE, VAMPIRE")
- BASEBALL scenario distribution: paper p. 569 ("equal probability to the concepts BALL, BAT-STICK, HOLD, PLAYER, and STONE, and zero probability to all other concepts")
- GOTHIC scenario distribution: paper p. 569 ("equal probability to the concepts BAT-ANIMAL, CANDLE, CAT, HOLD, and VAMPIRE, and zero probability otherwise")
- HOLD-THEME selectional: paper p. 569 ("P(c | HOLD-THEME) = {0 for c=HOLD; 0.125 else}")
- HOLD-AGENT selectional: NOT specified by paper; we assume uniform 1/8 over non-HOLD concepts (analogous to HOLD-THEME)
- VERB-SELF selectional (placeholder for the no-role hold concept-node): uniform 1/9 over all concepts. Contributes a constant factor that washes out in normalization, so doesn't affect the posterior.
The 9-concept inventory of paper §5.1 p. 569.
- BALL : BatConcept
- BAT_ANIMAL : BatConcept
- HOLD : BatConcept
- BAT_STICK : BatConcept
- CANDLE : BatConcept
- CAT : BatConcept
- PLAYER : BatConcept
- STONE : BatConcept
- VAMPIRE : BatConcept
Instances For
Equations
- One or more equations did not get rendered due to their size.
Equations
- ErkHerbelot2024.instDecidableEqBatConcept x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- ErkHerbelot2024.instReprBatConcept = { reprPrec := ErkHerbelot2024.instReprBatConcept.repr }
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- ErkHerbelot2024.instInhabitedBatConcept = { default := ErkHerbelot2024.BatConcept.HOLD }
The 2-scenario inventory of paper §5.1 p. 569.
- BASEBALL : BatScenario
- GOTHIC : BatScenario
Instances For
Equations
- One or more equations did not get rendered due to their size.
Equations
- ErkHerbelot2024.instDecidableEqBatScenario x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- ErkHerbelot2024.instReprBatScenario = { reprPrec := ErkHerbelot2024.instReprBatScenario.repr }
Equations
Roles in the bat-in-player sentence. verb_self is a uniform-selectional
placeholder for the verb concept-node, which has no role attached in the
paper's graphical model (paper Figure 5 node 5: holds concept node, no
incoming role edge).
Instances For
Equations
- One or more equations did not get rendered due to their size.
Equations
- ErkHerbelot2024.instDecidableEqBatRole x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- ErkHerbelot2024.instReprBatRole = { reprPrec := ErkHerbelot2024.instReprBatRole.repr }
Equations
- One or more equations did not get rendered due to their size.
Instances For
The BASEBALL scenario's concept distribution: uniform 1/5 over {BALL, BAT-STICK, HOLD, PLAYER, STONE}, zero elsewhere. Paper p. 569.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The GOTHIC scenario's concept distribution: uniform 1/5 over {BAT-ANIMAL, CANDLE, CAT, HOLD, VAMPIRE}, zero elsewhere. Paper p. 569.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Per-scenario concept distribution.
Equations
Instances For
The HOLD-THEME selectional preference: uniform 1/8 over the 8 non-HOLD concepts. Paper p. 569 ("P(c | HOLD-THEME) = {0 for c=HOLD; 0.125 else}").
Equations
- One or more equations did not get rendered due to their size.
Instances For
The HOLD-AGENT selectional preference. NOT specified in paper §5.1; we assume uniform 1/8 over the 8 non-HOLD concepts (analogous to HOLD-THEME).
Instances For
Verb-self placeholder selectional: uniform 1/9 over all 9 concepts. For the verb concept-node which has no role attached in paper Figure 5; the uniform contribution factors out of normalization and doesn't affect the marginal posterior at any other node.
Equations
- ErkHerbelot2024.verbSelfSel = PMF.uniformOfFintype ErkHerbelot2024.BatConcept
Instances For
Per-role selectional preferences.
Equations
Instances For
The paper's bat-in-player graphical model, parameterized by the Dirichlet concentration α.
Equations
- ErkHerbelot2024.batModel α hα = { perScenario := ErkHerbelot2024.batPerScenario, selectional := ErkHerbelot2024.batSelectional, alpha := α, alphaPos := hα }
Instances For
The 3-node sentence structure. Paper Figure 5: nodes (5) holds, (8) player, (9) bat. We index them 0, 1, 2 and assign roles per the figure: the verb has no role (uniform placeholder), player gets HOLD-AGENT, bat gets HOLD-THEME.
Equations
Instances For
Observations: the surface form at each concept-node restricts the admissible concept set. Paper Figure 5 nodes (10)-(14):
- node (12) observes "hold(_)" → c_hold = HOLD
- node (10) observes "player(_)" → c_player = PLAYER
- node (14) observes "bat(_)" → c_bat ∈ {BAT-ANIMAL, BAT-STICK}
Equations
Instances For
Closed-form derivation #
Conditional on observations:
- c_player = PLAYER. Since perScenario(GOTHIC, PLAYER) = 0, the player observation forces s_player = BASEBALL. (Paper's "probabilistic modus tollens", p. 570.)
- c_hold = HOLD. Both BASEBALL and GOTHIC give P(HOLD | scenario) = 1/5, so the hold observation does not constrain s_hold.
- c_bat ∈ {BAT-ANIMAL, BAT-STICK}. Constrained:
- If s_bat = BASEBALL, only BAT-STICK has nonzero P (1/5).
- If s_bat = GOTHIC, only BAT-ANIMAL has nonzero P (1/5).
So nonzero (s_hold, s_player, s_bat, c_hold, c_player, c_bat) configurations are exactly the 4 cases:
- (B, B, B, HOLD, PLAYER, BAT-STICK) — counts (3, 0)
- (B, B, G, HOLD, PLAYER, BAT-ANIMAL) — counts (2, 1)
- (G, B, B, HOLD, PLAYER, BAT-STICK) — counts (2, 1)
- (G, B, G, HOLD, PLAYER, BAT-ANIMAL) — counts (1, 2)
Each configuration has factor: seqProb_α(counts) · perScenario(s_hold, HOLD)·perScenario(s_player, PLAYER)·perScenario(s_bat, c_bat) · selectional(verb-self, HOLD)·selectional(hold-agent, PLAYER)·selectional(hold-theme, c_bat)
The perScenario products are 1/5·1/5·1/5 = 1/125 in all 4 configurations (since c_hold=HOLD is in both scenarios, c_player=PLAYER is only in BASEBALL but s_player is forced to BASEBALL anyway, and c_bat is forced to match s_bat).
The selectional products are 1/9·1/8·1/8 in all 4 configurations.
So the 4 factors are proportional to seqProb alone:
For α = 1/2: seqProb(3,0) = 5/16, seqProb(2,1) = 1/16, seqProb(1,2) = 1/16 → Factors (BAT-STICK first, BAT-ANIMAL second): 5, 1 / 1, 1 → BAT-STICK total: 6, BAT-ANIMAL total: 2 → Posterior: 3/4 / 1/4
For α = 1/10: seqProb(3,0) = 7/16, seqProb(2,1) = 1/48, seqProb(1,2) = 1/48 → Factors: 7/16, 1/48 / 1/48, 1/48 → BAT-STICK total: 7/16+1/48 = 11/24, BAT-ANIMAL: 2/48 = 1/24 → Posterior: 11/12 / 1/12
The 4-element support of nonzero jointFactorObs configurations for
the bat-in-player sentence. By the closed-form analysis above, only
these 4 (s_hold, s_player, s_bat, c_hold, c_player, c_bat)
configurations have nonzero factor:
- (BB, BB, BB; HOLD, PLAYER, BAT_STICK)
- (BB, BB, GO; HOLD, PLAYER, BAT_ANIMAL)
- (GO, BB, BB; HOLD, PLAYER, BAT_STICK)
- (GO, BB, GO; HOLD, PLAYER, BAT_ANIMAL)
Used with GraphicalModel.conceptPosteriorAt_eq_of_support to discharge
the Table 1 theorems.
Equations
- One or more equations did not get rendered due to their size.
Instances For
h_supp discharge: structural blocker lemmas #
Rather than decide over the full 5832-element configuration space (which
exceeds Lean's maxRecDepth and crashes with maxRecDepth 32000),
we identify the ≤ 6 STRUCTURAL conditions any (sA, cA) ∉ supp must satisfy,
and use the SDS substrate's helper lemmas to discharge each.
α-monotonicity (paper §5.1, p. 571) #
Paper text: "this preference grows more pronounced when the concentration parameter α of the Dirichlet distribution is lower, that is, when we implement a stronger preference towards sparse scenario distributions."
The qualitative theorem we want: as α decreases from 1/2 to 1/10, P(BAT-STICK | obs) increases from 3/4 to 11/12.
Since 11/12 > 3/4, this is true for our two specific α values. The
general monotonicity statement (∀ α₁ ≤ α₂, …) requires the Polya-urn
predictive monotonicity in α, which is a known property
(PolyaUrn.predictive_mono in Core/Probability/PolyaUrn.lean proves
monotonicity in counts; the α-direction would be a separate theorem).
Paper §5.2: "an astronomer married a star" #
Paper p. 571: "We use two scenarios. The scenario STARGAZING gives equal probabilities to the concepts ASTRONOMER, STAR(SUN), and MARRY, and zero otherwise, while the scenario STAGE gives equal probabilities to the concepts STAR(PERSON) and MARRY, and zero otherwise (For simplicity, we have added MARRY to both scenarios instead of adding a third scenario.) The concept MARRY has mandatory Agent and Theme roles, both with a strong preference for human role fillers: We set P(c | MARRY-THEME) = 0.475 for a concept c = ASTRONOMER or c = STAR-PERSON and P(c | MARRY-THEME) = 0.05 for c = STAR-SUN."
Note the selectional is non-uniform here (unlike the bat-in-player HOLD-THEME): 0.475/0.475/0.05/0 over (ASTRONOMER, STAR-PERSON, STAR-SUN, MARRY). This makes the closed-form derivation slightly more involved.
The signature pun phenomenon: under MARRY-THEME, STAR-PERSON has 9.5× higher selectional weight than STAR-SUN. But under STARGAZING scenario, STAR-SUN has nonzero P while STAR-PERSON has zero. The "pun" arises because conditioning on the observed sentence requires resolving WHETHER the scenario is STARGAZING (favoring STAR-SUN) or STAGE (favoring STAR-PERSON), and both are plausible given the "astronomer" + "marry" + "star" observations.
The 4-concept inventory for the astronomer-married-star sentence (paper §5.2 p. 571 lists: ASTRONOMER, STAR-SUN, STAR-PERSON, MARRY).
- ASTRONOMER : StarConcept
- STAR_PERSON : StarConcept
- STAR_SUN : StarConcept
- MARRY : StarConcept
Instances For
Equations
- One or more equations did not get rendered due to their size.
Equations
- ErkHerbelot2024.instDecidableEqStarConcept x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- ErkHerbelot2024.instReprStarConcept = { reprPrec := ErkHerbelot2024.instReprStarConcept.repr }
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Equations
- ErkHerbelot2024.instDecidableEqMarryScenario x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Roles in the astronomer-married-star sentence. marry_self is a
uniform-PMF placeholder for the verb concept node.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Equations
- ErkHerbelot2024.instDecidableEqMarryRole x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- ErkHerbelot2024.instReprMarryRole = { reprPrec := ErkHerbelot2024.instReprMarryRole.repr }
STARGAZING scenario distribution: equal 1/3 over {ASTRONOMER, STAR_SUN, MARRY}. Paper p. 571.
Equations
- One or more equations did not get rendered due to their size.
Instances For
STAGE scenario distribution: equal 1/2 over {STAR_PERSON, MARRY}.
Equations
Instances For
Per-scenario concept distribution.
Equations
Instances For
The MARRY-THEME selectional preference: paper p. 571 cites
P(ASTRONOMER) = P(STAR_PERSON) = 0.475, P(STAR_SUN) = 0.05. We assume
P(MARRY) = 0 (paper doesn't mention MARRY as an option for THEME, and
the three given values sum to exactly 1).
Sum-to-1: enumerate the 4-element Fintype, push numerals through
ENNReal.div_add_div_same, close via ENNReal.div_self.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The MARRY-AGENT selectional preference. Paper says "both with a strong preference for human role fillers" — assumed same shape as MARRY-THEME. Doesn't matter for the posterior since c_astronomer is observed = ASTRONOMER in all configs (factor cancels in normalization).
Instances For
Verb-self placeholder for the marry concept-node (no role attached in paper Figure 6). Doesn't matter for the posterior since c_marry is observed = MARRY in all configs.
Equations
- ErkHerbelot2024.marrySelfSel = PMF.uniformOfFintype ErkHerbelot2024.StarConcept
Instances For
Per-role selectional preferences.
Equations
- ErkHerbelot2024.starSelectional ErkHerbelot2024.MarryRole.marry_agent = ErkHerbelot2024.marryAgentSel
- ErkHerbelot2024.starSelectional ErkHerbelot2024.MarryRole.marry_theme = ErkHerbelot2024.marryThemeSel
- ErkHerbelot2024.starSelectional ErkHerbelot2024.MarryRole.marry_self = ErkHerbelot2024.marrySelfSel
Instances For
The astronomer-married-star graphical model.
Equations
- ErkHerbelot2024.astronomerModel α hα = { perScenario := ErkHerbelot2024.starPerScenario, selectional := ErkHerbelot2024.starSelectional, alpha := α, alphaPos := hα }
Instances For
The 3-node sentence "an astronomer married a star". Paper Figure 6: left node is ASTRONOMER concept, middle is MARRY (verb), right is STAR concept. Roles per paper: ASTRONOMER node gets MARRY-AGENT, STAR node gets MARRY-THEME, MARRY node has no role attached.
Equations
Instances For
Observations: astronomer admits {ASTRONOMER}, marry admits {MARRY}, star admits {STAR_PERSON, STAR_SUN}.
Equations
Instances For
Closed-form derivation for the astronomer-married-star posterior #
The astronomer observation forces s_astronomer = STARGAZING (since perScenario(STAGE, ASTRONOMER) = 0). The marry observation does not constrain s_marry (both scenarios admit MARRY). The star observation constrains the (s_star, c_star) pair:
- s_star = STARGAZING ⇒ c_star = STAR-SUN (only c with nonzero P)
- s_star = STAGE ⇒ c_star = STAR-PERSON
Nonzero (s_astron, s_marry, s_star, c_star) configurations:
- (ST, ST, ST, STAR-SUN) — counts (3, 0)
- (ST, ST, SG, STAR-PERSON) — counts (2, 1)
- (ST, SG, ST, STAR-SUN) — counts (2, 1)
- (ST, SG, SG, STAR-PERSON) — counts (1, 2)
Each configuration's factor (after dividing out constant terms across all 4 configs):
factor = seqProb_α(counts) · perScenario(s_marry, MARRY) · perScenario(s_star, c_star) · sel(MARRY-THM, c_star)
Variable per-config values:
- perScenario(ST, MARRY) = 1/3, perScenario(SG, MARRY) = 1/2
- perScenario(ST, STAR-SUN) = 1/3, perScenario(SG, STAR-PERSON) = 1/2
- sel(MARRY-THM, STAR-SUN) = 1/20 = 0.05
- sel(MARRY-THM, STAR-PERSON) = 19/40 = 0.475
For α = 1/2:
- seqProb(3,0) = 5/16, seqProb(2,1) = seqProb(1,2) = 1/16
Configs (raw factors before LCM):
- STAR-SUN: 5/16 · 1/3 · 1/3 · 1/20 = 5/2880
- STAR-PERSON: 1/16 · 1/3 · 1/2 · 19/40 = 19/3840
- STAR-SUN: 1/16 · 1/2 · 1/3 · 1/20 = 1/1920
- STAR-PERSON: 1/16 · 1/2 · 1/2 · 19/40 = 19/2560
Common denom 23040 = 2⁹ · 3² · 5:
- Config 1: 40/23040, Config 2: 114/23040, Config 3: 12/23040, Config 4: 171/23040
- STAR-SUN total: 52, STAR-PERSON total: 285, Z = 337
- P(STAR-PERSON | obs, α=1/2) = 285/337 ≈ 0.846
- P(STAR-SUN | obs, α=1/2) = 52/337 ≈ 0.154
For α = 1/10:
- seqProb(3,0) = 7/16, seqProb(2,1) = seqProb(1,2) = 1/48
Configs (raw factors):
- STAR-SUN: 7/16 · 1/3 · 1/3 · 1/20 = 7/2880
- STAR-PERSON: 1/48 · 1/3 · 1/2 · 19/40 = 19/11520
- STAR-SUN: 1/48 · 1/2 · 1/3 · 1/20 = 1/5760
- STAR-PERSON: 1/48 · 1/2 · 1/2 · 19/40 = 19/7680
Common denom 23040:
- Config 1: 56/23040, Config 2: 38/23040, Config 3: 4/23040, Config 4: 57/23040
- STAR-SUN total: 60, STAR-PERSON total: 95, Z = 155
- P(STAR-PERSON | obs, α=1/10) = 95/155 = 19/31 ≈ 0.613
- P(STAR-SUN | obs, α=1/10) = 60/155 = 12/31 ≈ 0.387
| α | Our framework P(STAR-PERSON) | Paper Table 2 |
|---|---|---|
| 1/2 | 285/337 ≈ 0.846 | 0.82 |
| 1/10 | 19/31 ≈ 0.613 | 0.57 |
Discrepancy at α=1/2 is ≈ 2.5pp; at α=1/10 is ≈ 4pp. Both are within plausible Monte Carlo noise for paper's 2000-sample WebPPL simulation (SE ≈ 0.009-0.01 on these probabilities). The astronomer example tracks the paper's Table 2 numbers more closely than the bat-in-player tracks Table 1, possibly because the non-uniform selectional captures more structure.
The qualitative pun phenomenon — STAR-SUN remains a meaningful proportion of the posterior even though MARRY-THEME strongly prefers human fillers — is reproduced: at α=1/2, STAR-SUN gets 0.154; at α=1/10, 0.387. Lower α (more peaked scenario mixtures) increases the STAR-SUN reading, exactly as paper §5.2 p. 572 advertises ("the more emphasis there is on a coherent scenario (the lower the value of α), the more probability mass is given to the situation where an astronomer marries a giant ball of plasma").
The signature qualitative result of paper §5.2: lower α → more probability mass on the STAR-SUN reading (i.e., the giant-ball-of-plasma interpretation gets stronger as the scenario distribution gets sparser).
Numerically: at α=1/2, P(STAR-SUN) = 52/337 ≈ 0.154; at α=1/10, P(STAR-SUN) = 12/31 ≈ 0.387. Since 12/31 > 52/337, lowering α from 1/2 to 1/10 increases the STAR-SUN posterior.