[TG19]: The Language of Generalization #
Psychological Review, 126(3), 395–436.
Core Insight #
Generics ("Robins lay eggs") use the SAME uncertain threshold semantics as gradable adjectives. The scale is prevalence rather than height/degree:
⟦gen⟧(p, θ) = 1 if prevalence p > threshold θ, and 0 otherwise
This IS positiveMeaning from Degree — the generic meaning is
grounded in scalar adjective semantics by construction, not by bridge theorem.
Model #
Interpretation model (L0, Eq. 1): L(p, θ | u) ∝ δ_{⟦u⟧(p,θ)} · P(θ) · P(p)
Endorsement model (S1, Eq. 3): S(u | p) ∝ (∫_θ L(p, θ | u) dθ)^λ
The threshold θ is marginalized BEFORE exponentiation (matching the paper). With N discrete thresholds, the marginalized L0 is: L0(p | generic) ∝ P(p) · |{θ : p > θ}| = P(p) · p.toNat
This analytical marginalization eliminates the latent variable entirely.
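The counting step can be checked directly. The following is a minimal sketch in core Lean 4 (no mathlib), assuming bins and thresholds are plain natural numbers; `genericHolds` and `genericCount` are illustrative names, not declarations from the formalization.

```lean
/-- Generic meaning on bin indices: prevalence bin `p` exceeds threshold `θ`. -/
def genericHolds (p θ : Nat) : Bool := decide (θ < p)

/-- Number of thresholds θ ∈ {0,…,19} that the generic satisfies at bin `p`. -/
def genericCount (p : Nat) : Nat :=
  ((List.range 20).filter (fun θ => genericHolds p θ)).length

-- For every bin p ∈ {0,…,20} the count equals the bin index itself,
-- which is why the marginalized meaning is P(p) · p.toNat.
#eval (List.range 21).all (fun p => genericCount p == p)  -- true
```

Silence is true under every threshold, so its count is the constant 20 regardless of the bin.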
The model is the mathlib-PMF RSA pipeline ([FG12]): the literal
listener L0gen prior u : PMF Prevalence normalises the marginalized meaning
meaningE prior u p = P(p) · |{θ : ⟦u⟧(p,θ)}| (lifted to ℝ≥0∞), and the speaker
S1gen prior p : PMF Utterance is RSA.S1Belief with α = 1, zero cost. Each
prediction is one application of S1Belief_apply_lt_iff_score_lt (the rsa simp
set): the partition cancels, leaving an L0 comparison that reduces to the cue
validity test p.toNat > E[k | prior]. The prior expectation is the shared
PMF.condExpect of the silent listener's posterior (expectedBin_eq_condExpect).
Parameters #
All parameters from the paper's code (analysis/model-simulations.Rmd,
exampleParameters list, GitHub: mhtess/genlang-paper):
- α = 2 in the paper (experimental fit: 2.47). We use α = 1, since the binary comparison S1(generic) > S1(silent) is α-invariant for α > 0.
- Bins: paper uses 98 bins (0.01–0.98); we use 21 bins (0%, 5%, ..., 100%) for exact rational arithmetic. Qualitative predictions are preserved.
- Null component: Beta(1, 50)
| Property | Stable Beta | φ (mix) | Ref. prev. | Paper endorse |
|---|---|---|---|---|
| bark | Beta(5,1) | 0.4 | 95% | 0.88 |
| hasSpots | Beta(5,1) | 0.7 | 10% | 0.02 |
| dontEatPeople | Beta(10,1)* | 1.0 | 80% | 0.41 |
| laysEggs | Beta(10,10) | 0.2 | 50% | 0.95 |
| isFemale | Beta(10,10) | 1.0 | 50% | 0.50 |
| carriesMalaria | Beta(1,30) | 0.1 | 10% | 0.97 |
*Paper uses Beta(50,1); we use Beta(10,1) for tractable arithmetic (avoids k^49 terms). Both give the same qualitative prediction.
Prior Model #
Prevalence priors are mixtures of two Beta distributions (Figure 2): P(p) = φ · Beta_stable(p) / Z_s + (1-φ) · Beta_null(p) / Z_n
where φ is the probability a category has the stable causal mechanism, Beta_stable varies per property, and Beta_null = Beta(1,50) for all properties (representing categories lacking the property mechanism).
Each component is NORMALIZED before mixing (matching the WebPPL code,
which uses categorical to normalize each component independently).
We achieve this without ℚ division by computing:
P(p) ∝ φ · BW_s(p) · Z_n + (1-φ) · BW_n(p) · Z_s
Verified Predictions #
| # | Finding | Prior | p_ref | Theorem |
|---|---|---|---|---|
| 1 | "Dogs bark" endorsed | bark | 95% | bark_endorsed |
| 2 | "Kangaroos have spots" NOT endorsed | hasSpots | 10% | spots_not_endorsed |
| 3 | "Sharks don't eat people" NOT endorsed | dontEatPeople | 80% | dontEatPeople_not_endorsed |
| 4 | "Robins lay eggs" endorsed despite 50% | laysEggs | 50% | laysEggs_endorsed |
| 5 | "Robins are female" borderline at 50% | isFemale | 50% | isFemale_borderline |
| 6 | "Mosquitos carry malaria" endorsed at 10% | carriesMalaria | 10% | malaria_endorsed |
| 7 | Max prevalence satisfies all thresholds | — | — | generic_top_true |
| 8 | Zero prevalence fails all thresholds | — | — | generic_zero_false |
| 9 | Only rareWeak endorsed at 20% | all four causal | 20% | causal_20pct_pattern |
| 10 | 3/4 causal conditions endorsed at 70% | all four causal | 70% | causal_70pct_pattern |
| 11 | Endorsement ⟺ exceeds E[k ∣ prior] | — | — | endorsement_iff_exceeds_expected |
Discretized prevalence: 0%, 5%, ..., 100% (21 values). Structurally identical to [LG17]'s Height.
Threshold values θ₀–θ₁₉ (20 values).
Prevalence at p% (bins at 5% increments, so p must be a multiple of 5). Uses a macro so the division is computed at elaboration time.
Threshold at t% (bins at 5% increments, so t must be a multiple of 5). Uses a macro so the division is computed at elaboration time.
Equations
- TesslerGoodman2019.termThrPct_ = Lean.ParserDescr.node `TesslerGoodman2019.termThrPct_ 1022 (Lean.ParserDescr.binary `andthen (Lean.ParserDescr.symbol "thrPct") (Lean.ParserDescr.const `num))
Generic vs null utterance. The endorsement model decides between producing the generalization and staying silent.
⟦gen⟧(p, θ) = p > θ.
This IS positiveMeaning from Degree — the generic meaning
function is literally the positive scalar adjective meaning applied to the
prevalence scale. Grounded by construction.
Equations
- TesslerGoodman2019.genericMeaning θ p = decide (Degree.positiveMeaning p θ)
Full meaning function: utterance × threshold → prevalence → Bool.
Mixture-of-Betas infrastructure #
The paper models prevalence priors as mixtures of two Beta distributions: a stable component (property-specific) and a null component (Beta(1,50), representing categories without the causal mechanism).
Each component is normalized before mixing (matching the WebPPL code where
categorical normalizes each component independently). We achieve this
without ℚ division by computing:
P(k) ∝ φ · BW_stable(k) · Z_null + (1-φ) · BW_null(k) · Z_stable
This is proportional to the correctly normalized mixture since: P(k) = Z_n · Z_s · [φ · BW_s(k)/Z_s + (1-φ) · BW_n(k)/Z_n]
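The cross-multiplication can be sanity-checked numerically. This is a sketch in core Lean 4, not the formalization's `mixturePrior` (whose equations are not rendered on this page): it assumes φ is encoded as a pair of naturals `num/den`, and `mixtureWeight` and `mixtureTotal` are hypothetical names.

```lean
/-- Unnormalized Beta(a,b) weight at bin k ∈ {0,…,20}. -/
def betaWeight (a b k : Nat) : Nat := k ^ (a - 1) * (20 - k) ^ (b - 1)

/-- Sum of Beta(a,b) weights across all 21 bins. -/
def betaTotal (a b : Nat) : Nat :=
  (List.range 21).foldl (fun acc k => acc + betaWeight a b k) 0

/-- Cross-multiplied mixture weight at bin `k`, with φ = num/den (assumed encoding).
    Stable component Beta(sa, sb), null component Beta(na, nb). -/
def mixtureWeight (num den sa sb na nb k : Nat) : Nat :=
  num * betaWeight sa sb k * betaTotal na nb
    + (den - num) * betaWeight na nb k * betaTotal sa sb

/-- Total mass of the cross-multiplied mixture. -/
def mixtureTotal (num den sa sb na nb : Nat) : Nat :=
  (List.range 21).foldl (fun acc k => acc + mixtureWeight num den sa sb na nb k) 0

-- "Bark" parameters: φ = 2/5, stable Beta(5,1), null Beta(1,50).
-- The total equals den · Z_s · Z_n, so the stable and null components
-- contribute exactly the shares φ and 1 − φ.
#eval mixtureTotal 2 5 5 1 1 50 == 5 * betaTotal 5 1 * betaTotal 1 50  -- true
```

Staying in ℕ keeps every weight exact; the common factor den · Z_s · Z_n cancels when the prior is normalized.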
Unnormalized Beta(a,b) weight at bin k ∈ {0,...,20}. Proportional to Beta(a,b) PDF at x = k/20.
Equations
- TesslerGoodman2019.betaWeight a b k = k ^ (a - 1) * (20 - k) ^ (b - 1)
Sum of Beta(a,b) weights across all 21 bins.
Equations
- TesslerGoodman2019.betaTotal a b = List.foldl (fun (acc k : ℕ) => acc + TesslerGoodman2019.betaWeight a b k) 0 (List.range 21)
Normalized mixture-of-Betas prevalence prior, discretized to 21 bins.
- Stable component: Beta(as, bs) with mixture weight φ
- Null component: Beta(na, nb) with mixture weight (1-φ)
Each component is normalized before mixing by cross-multiplying with the other component's total weight:
P(k) ∝ φ · BW_stable(k) · Z_null + (1-φ) · BW_null(k) · Z_stable
This avoids ℚ division while preserving the correct mixture ratio.
"Bark" prior: bimodal at 0 and ~90% (Figure 2, column 1). Stable Beta(5,1), φ = 0.4.
Equations
- TesslerGoodman2019.barkPrior = TesslerGoodman2019.mixturePrior (2 / 5) 5 1 1 50
"Have spots" prior: bimodal at 0 and ~90% (Figure 2, column 2). Stable Beta(5,1), φ = 0.7. Higher φ than bark: more animal categories can have spots than can bark.
Equations
- TesslerGoodman2019.haveSpotsPrior = TesslerGoodman2019.mixturePrior (7 / 10) 5 1 1 50
"Don't eat people" prior: near-unimodal at ~90% (Figure 2, column 3). Stable Beta(10,1), φ = 1.0. Paper uses Beta(50,1); we use Beta(10,1) for tractable arithmetic (avoids k^49 terms). Both predict NOT endorsed at 80%.
"Lays eggs" prior: bimodal at 0 and ~50% (Figure 2, column 4). Stable Beta(10,10), φ = 0.2. Most animal categories don't have egg-layers (peak at 0); among those that do, only females lay eggs (~50% prevalence).
Equations
- TesslerGoodman2019.laysEggsPrior = TesslerGoodman2019.mixturePrior (1 / 5) 10 10 1 50
"Is female" prior: unimodal at ~50% (Figure 2, column 5). Stable Beta(10,10), φ = 1.0. Almost all animal categories have ~50% female members.
Equations
- TesslerGoodman2019.isFemalePrior = TesslerGoodman2019.mixturePrior 1 10 10 1 50
"Carries malaria" prior: extreme low prevalence (Figure 2, column 6). Stable Beta(1,30), φ = 0.1. Very few animal categories carry diseases (90% null component). Among those that do, prevalence is very low (Beta(1,30) peaked near 0).
Equations
- TesslerGoodman2019.carriesMalariaPrior = TesslerGoodman2019.mixturePrior (1 / 10) 1 30 1 50
Number of thresholds θ ∈ {0,...,19} satisfying p > θ.
For generic: count = p.toNat (0 for p=0, 1 for p=1, ..., 20 for p=20). For silence: count = 20 (all thresholds pass).
The marginalized meaning lifted to ℝ≥0∞: φ(u, p) = P(p) · |{θ : ⟦u⟧(p,θ)}|.
The latent threshold θ is marginalized analytically into the meaning, so the
literal listener normalises this directly (matching the paper's structure, where
θ is integrated out before exponentiation). For the generic the count is
p.toNat; for silence it is the full 20.
Equations
- TesslerGoodman2019.meaningE prior u p = ENNReal.ofReal (prior p * ↑(TesslerGoodman2019.thresholdCount u p))
Literal listener L0(·|u) : PMF Prevalence, normalising the marginalized meaning
([FG12]; the threshold-marginalized Eq. 1 of [TG19]).
Equations
- TesslerGoodman2019.L0gen prior u h0 hT = RSA.L0OfMeaning (TesslerGoodman2019.meaningE prior) u h0 hT
Total wrapper around L0gen: where the marginal is degenerate (mass 0) the
listener falls back to uniform, so the speaker S1gen below is a total PMF. On
every well-defined prior (tsum ≠ 0) it agrees with L0gen (L0genT_eq).
Pragmatic speaker S1(·|p) ∝ L0(p|u) (α = 1, zero cost), a PMF Utterance
([FG12]; the paper's Eq. 3 endorsement model). Total via L0genT:
where the world p has zero prior mass the speaker degenerates to silence, but
on the worlds the predictions evaluate (positive prior) it is the genuine
RSA.S1Belief over the marginalized listener.
Prevalence 100% satisfies the generic for all thresholds.
Generic meaning at prevalence 0% is false for all thresholds.
The bimodal "lays eggs" prior peaks at zero prevalence.
The unimodal "is female" prior peaks at 50%.
Endorsement model (Eq. 3) #
The paper's key predictions are endorsement rates: given referent prevalence p for a kind k, does the speaker produce the generic?
S(u | p) ∝ (∫_θ L(p,θ|u) dθ)^λ
Endorsement > 50% ⟺ S1(generic | p) > S1(silent | p).
The binary comparison is equivalent to tc(p) > E[tc | prior], i.e., the referent prevalence (in threshold-count units) exceeds the prior expected prevalence. This is the paper's central insight: the SAME prevalence can produce different endorsement rates depending on the prior (Figure 2).
Analytical endorsement condition #
The paper's central analytical result (Appendix A) is that the endorsement comparison reduces to a cue validity test:
S1(generic | p) > S1(silent | p) ⟺ p.toNat > E[k | prior]
i.e., the referent prevalence bin exceeds the prior expected bin.
Proof sketch: S1(u|p) ∝ rpow(L0(p|u), α). Since rpow is monotone for α > 0, the comparison reduces to L0(p|generic) > L0(p|silent). Expanding:
L0(p|u) = meaning(u,p) / Z_u = prior(p) · tc(u,p) / Z_u
For the generic, Z_gen = Σ_w prior(w) · w.toNat; for silence, Z_sil = 20 · Z_prior. Dividing by prior(p) > 0 and cross-multiplying:
p.toNat / Z_gen > 20 / Z_sil ⟺ p.toNat > Z_gen / Z_prior = E[k | prior]
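The cross-multiplied form of this test, p.toNat · Σ_k P(k) > Σ_k k · P(k), can be evaluated in exact natural-number arithmetic. The sketch below is core Lean 4 under stated assumptions: φ is encoded as `num/den`, the prior follows the cross-multiplied mixture formula given in the text, and `mixtureWeight` and `endorsed` are hypothetical names rather than the formalization's definitions. Bin index = percentage / 5.

```lean
/-- Unnormalized Beta(a,b) weight at bin k ∈ {0,…,20}. -/
def betaWeight (a b k : Nat) : Nat := k ^ (a - 1) * (20 - k) ^ (b - 1)

/-- Sum of Beta(a,b) weights across all 21 bins. -/
def betaTotal (a b : Nat) : Nat :=
  (List.range 21).foldl (fun acc k => acc + betaWeight a b k) 0

/-- Cross-multiplied mixture weight, φ = num/den (assumed encoding). -/
def mixtureWeight (num den sa sb na nb k : Nat) : Nat :=
  num * betaWeight sa sb k * betaTotal na nb
    + (den - num) * betaWeight na nb k * betaTotal sa sb

/-- Endorsement test `p > E[k | prior]`, cross-multiplied to stay in ℕ. -/
def endorsed (prior : Nat → Nat) (p : Nat) : Bool :=
  let bins := List.range 21
  let z := bins.foldl (fun acc k => acc + prior k) 0      -- Σ_k P(k)
  let s := bins.foldl (fun acc k => acc + k * prior k) 0  -- Σ_k k · P(k)
  decide (s < p * z)

#eval endorsed (mixtureWeight 2 5 5 1 1 50) 19    -- bark at 95%: true
#eval endorsed (mixtureWeight 7 10 5 1 1 50) 2    -- hasSpots at 10%: false
#eval endorsed (mixtureWeight 1 1 10 1 1 50) 16   -- dontEatPeople at 80%: false
#eval endorsed (mixtureWeight 1 5 10 10 1 50) 10  -- laysEggs at 50%: true
#eval endorsed (mixtureWeight 1 1 10 10 1 50) 10  -- isFemale at 50%: false (on the boundary)
#eval endorsed (mixtureWeight 1 10 1 30 1 50) 2   -- carriesMalaria at 10%: true
```

Because the comparison is strict, a referent that sits exactly at the prior mean (the "is female" case) is not endorsed.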
Expected prevalence bin under a prior: E[k | prior] = Σ_k k·P(k) / Σ_k P(k).
Equations
- TesslerGoodman2019.expectedBin prior = (∑ w : TesslerGoodman2019.Prevalence, prior w * ↑(Degree.Degree.toNat w)) / ∑ w : TesslerGoodman2019.Prevalence, prior w
The endorsement condition reduces to a cue validity comparison: a generic is endorsed iff the referent prevalence bin exceeds the prior expected bin. This is the paper's central analytical result (Appendix A).
The hypotheses hgen/hsil (the two normalisers are non-degenerate) are exactly
PMF.normalize's well-definedness obligations; they hold for every real prior with
positive total mass and a positive-count referent (see meaningE_generic_ne_zero,
meaningE_silent_ne_zero).
Proof: the rsa simp set reduces the S1Belief comparison to the L0 scores
(S1Belief_apply_lt_iff_score_lt, partition cancels); L0gen_apply expands each to
meaningE · / Σ meaningE; lifting ofReal out cross-multiplies to
p.toNat × Σ prior > Σ prior · toNat, i.e. p.toNat > E[k|prior].
The endorsement boundary is the silent listener's expected prevalence #
The silent utterance is true at every prevalence, so the silent literal listener's
posterior L0gen prior .silent is the (normalised) prevalence prior. Its
PMF.condExpect over all prevalences is therefore exactly expectedBin prior — the
boundary in endorsement_iff_exceeds_expected, expressed through the shared
conditional-expectation API rather than as an ad-hoc ratio.
The silent listener's posterior is the normalised prior: L0(p | silent) = P(p)/Z.
expectedBin is a conditional expectation: the endorsement boundary equals
E[prevalence | silent listener], computed through the shared PMF.condExpect.
Symmetric priors put the endorsement boundary at the centre #
A prior invariant under bin-reflection k ↦ 20 − k has its mean at the central bin
(50% prevalence). This makes the "robins are female" borderline a theorem about
symmetry rather than a numerical coincidence.
A reflection-symmetric prior has its mean exactly at the centre bin (50%).
The "is female" prior — Beta(10, 10) with mixture weight φ = 1 — is invariant
under bin-reflection, since betaWeight a b k = betaWeight b a (20−k) and here
a = b = 10.
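Both facts can be checked on the 21 bins. This is a finite check in core Lean 4, not a proof; `reflects` and `meanIsCentre` are illustrative names introduced here.

```lean
/-- Unnormalized Beta(a,b) weight at bin k ∈ {0,…,20}. -/
def betaWeight (a b k : Nat) : Nat := k ^ (a - 1) * (20 - k) ^ (b - 1)

/-- Reflection identity betaWeight a b k = betaWeight b a (20 − k) on all bins. -/
def reflects (a b : Nat) : Bool :=
  (List.range 21).all (fun k => betaWeight a b k == betaWeight b a (20 - k))

/-- The mean of the weights `w` is the centre bin: Σ k·w(k) = 10 · Σ w(k). -/
def meanIsCentre (w : Nat → Nat) : Bool :=
  let bins := List.range 21
  let z := bins.foldl (fun acc k => acc + w k) 0
  let s := bins.foldl (fun acc k => acc + k * w k) 0
  s == 10 * z

#eval reflects 10 10                    -- true
#eval reflects 5 1                      -- true (the identity holds for any a, b)
#eval meanIsCentre (betaWeight 10 10)   -- true: a = b makes the prior symmetric
```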
"Dogs bark" endorsed at 95% prevalence (Table 1: 95%; Figure 2, column 1: 0.88).
"Robins lay eggs" endorsed at 50% prevalence (Figure 2, column 4: 0.95). Despite only 50% prevalence, the bimodal prior (peaked at 0 and 50%) makes the generic highly informative — it rules out the absent component.
"Mosquitos carry malaria" endorsed at 10% prevalence (Figure 2, column 6: 0.97). The prior expects near-zero prevalence, so even low prevalence is highly informative. This is the model's explanation of "striking property" generics: rare properties have low prior expectations.
"Kangaroos have spots" NOT endorsed at 10% prevalence (Figure 2, column 2: 0.02). Even though the prior has a null component, φ = 0.7 means 70% of the prior mass comes from the stable Beta(5,1) peaked near 100%. At 10% prevalence, the generic is uninformative relative to this high-prevalence expectation.
"Sharks don't eat people" NOT endorsed at 80% prevalence (Figure 2, column 3: 0.41). Even though 80% is high in absolute terms, the prior (φ=1, Beta(10,1)) concentrates nearly all mass above 80%. The generic is uninformative because the listener already expects very high prevalence.
"Robins are female" borderline at 50% prevalence (Figure 2, column 5: 0.50) — as
a symmetry theorem. The Beta(10,10), φ = 1 prior is reflection-symmetric
(isFemalePrior_symm), so its mean is exactly the centre bin
(expectedBin_of_symmetric: 50%); the 50%-referent then sits exactly on the
endorsement boundary, so the generic is no more informative than silence. Not a
numerical coincidence — a consequence of the prior's symmetry.
endorsement_iff_exceeds_expected says the endorsement decision is a single
threshold on the referent prevalence at the prior mean expectedBin. The model's
qualitative content is the behaviour of that threshold, derived below from the law
alone — no prior-specific computation, no decide.
Monotonicity in prevalence (Figure 1C). At a fixed prior, endorsement is
monotone in the referent prevalence: once a prevalence exceeds the prior mean,
every higher prevalence is endorsed too. The endorsement curve is a step up at
expectedBin prior.
Prevalence asymmetry ([Les08]) as a theorem about prior means. At one
and the same referent prevalence p, a property whose prior mean lies below p
is endorsed while one whose prior mean lies at or above p is not. The asymmetry
is therefore not a fact about the prevalence (it is shared) but the ordering of
the two prior means expectedBin prior₁ < p ≤ expectedBin prior₂ — exactly the
paper's explanation of "robins lay eggs" vs. "robins are female".
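The logical shape of both statements is simple once the boundary is fixed. The sketch below is core Lean 4 under a simplifying assumption: the prior mean is a natural number `e`, whereas in the formalization `expectedBin` is a ratio. The theorem names are hypothetical.

```lean
/-- Monotonicity: if `p` clears the boundary `e`, so does every `q ≥ p`. -/
theorem endorsed_mono {e p q : Nat} (hp : e < p) (hpq : p ≤ q) : e < q :=
  Nat.lt_of_lt_of_le hp hpq

/-- Asymmetry: at a shared referent `p`, a prior with mean `e₁ < p` endorses
    the generic while a prior with mean `e₂ ≥ p` does not. -/
theorem prior_mean_asymmetry {e₁ e₂ p : Nat} (h₁ : e₁ < p) (h₂ : p ≤ e₂) :
    e₁ < p ∧ ¬ e₂ < p :=
  ⟨h₁, Nat.not_lt.mpr h₂⟩
```

Neither statement mentions a specific prior: only the ordering of the means relative to `p` matters.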
The classic prevalence asymmetry is EXPLAINED by the endorsement model: same prevalence (50%), different prior shapes → different S1 endorsement rates.
"Robins lay eggs" (true, ~50% prevalence) vs "Robins are female" (odd, ~50% prevalence). [Les08] documents the empirical observation; [TG19] derives the asymmetry from prior shape differences.
laysEggs_endorsed and isFemale_borderline (above) derive the predictions.
Generic endorsement is not prevalence-functional — at the same referent
prevalence (50%), "robins lay eggs" is endorsed but "robins are female" is not.
The verdict is fixed by the property-specific prior, not by the prevalence
ratio. This is the prevalence asymmetry ([Les08]) as a structural fact:
no generalized quantifier whose truth depends only on the cell ratio
|R∩S| : |R∖S| — i.e. no Proportional quantifier, every counting quantifier
in Quantification.Counting including mostOn — can capture generic
endorsement. This is exactly where the majority view fails: contrast
Cohen1999.cohen_proportional, which shows Cohen's θ = 1/2 GEN is proportional
(and hence cannot exhibit this asymmetry).
As α → ∞, the endorsement model sharpens to a categorical decision: endorsed generics get probability 1, non-endorsed get probability 0.
By `rpow_luce_eq_softmax` (Core), every rpow-based Luce choice rule IS
softmax over log scores. The endorsement model inherits all softmax
limit theorems for free.
L0 score for utterance u at prevalence p (unnormalized).
Equations
- TesslerGoodman2019.l0Score prior u p = prior p * ↑(TesslerGoodman2019.thresholdCount u p)
The endorsement rate equals softmax over log-L0 scores.
Immediate from rpow_luce_eq_softmax: the endorsement model IS softmax.
When l0_gen > l0_sil (endorsed generic), the endorsement rate → 1
as α → ∞. Direct corollary of Softmax.tendsto_softmax_infty_at_max.
When l0_gen < l0_sil (non-endorsed generic), the endorsement rate → 0.
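A numerical illustration of the identity and the limit, as a sketch in core Lean 4 using `Float`. The scores 0.3 and 0.2 are made-up values with l0_gen > l0_sil; `luce` and `softmax` are illustrative names, not the library's definitions.

```lean
/-- Luce choice rule over two positive scores raised to the power α. -/
def luce (α gen sil : Float) : Float :=
  Float.pow gen α / (Float.pow gen α + Float.pow sil α)

/-- Softmax over log scores scaled by α. -/
def softmax (α gen sil : Float) : Float :=
  let g := Float.exp (α * Float.log gen)
  let s := Float.exp (α * Float.log sil)
  g / (g + s)

-- Each pair should agree up to floating-point error, and the rate should
-- climb toward 1 as α grows, since the generic has the larger score.
#eval ([1.0, 2.0, 5.0, 20.0] : List Float).map
  (fun α => (luce α 0.3 0.2, softmax α 0.3 0.2))
```

Swapping the two scores gives the non-endorsed case, where the rate falls toward 0 instead.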
Case Study 2: Habitual Language #
[TG19] (Case Study 2) extend the generic endorsement model
to habituals. The key insight: habituals ("John runs") use the same threshold
semantics as generics ("Birds fly"), with Prevalence now interpreted as frequency
of activity across occasions rather than proportion of a kind with a property.
Paper's actual prior model (Eq. 4): The paper uses a log-normal + delta mixture:
φ ~ Beta(γ, ξ)
ln(frequency) ~ Gaussian(μ, σ) with probability φ
frequency = 0.01 with probability (1 - φ)
The Beta parameters (γ, ξ) and Gaussian parameters (μ, σ) are fit to empirical frequency estimates from participants. We approximate the fitted priors with Beta mixtures that capture the qualitative predictions:
- Rare-activity priors (e.g., "climbs mountains", "writes novels"): most people never do this → low expected frequency
- High-frequency priors (e.g., "drinks coffee", "drives to work"): common daily activity → high expected frequency
- Moderate priors (e.g., "runs", "cooks dinner"): regular but not constant
The paper reports a model fit of r²(93) = 0.894 on habitual endorsement data.
See also: Genericity.thresholdGeneric (grounded on the canonical
Quantification.thresholdGtOn) for the threshold reading of GEN, completing
the pipeline:
GEN → threshold → uncertain threshold → RSA endorsement.
Frequency prior for "runs": moderate expectation.
Approximates the paper's fitted log-normal prior with a Beta(5,3) mixture.
The paper fits (γ, ξ, μ, σ) to participant frequency estimates;
the exact fitted values are in analysis/model-simulations.Rmd.
Equations
- TesslerGoodman2019.runsPrior = TesslerGoodman2019.mixturePrior (1 / 2) 5 3 1 50
Frequency prior for "climbs mountains": rare activity. Approximates the paper's fitted log-normal prior with a Beta(2,6) mixture.
Equations
- TesslerGoodman2019.climbsMountainsPrior = TesslerGoodman2019.mixturePrior (1 / 10) 2 6 1 50
Frequency prior for "drinks coffee": high-frequency activity. Approximates the paper's fitted log-normal prior with a Beta(7,2) mixture.
Equations
- TesslerGoodman2019.drinksCoffeePrior = TesslerGoodman2019.mixturePrior (4 / 5) 7 2 1 50
"John runs" endorsed at 75% frequency (moderate freq exceeds moderate prior).
"John climbs mountains" endorsed at 25% frequency (low freq exceeds rare-activity prior).
"John drinks coffee" NOT endorsed at 25% frequency (low freq below high-frequency prior).
Habitual prior asymmetry: at the same 25% frequency, "climbs mountains" is endorsed but "drinks coffee" is not — paralleling the generic prevalence asymmetry.
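The three habitual verdicts follow from the same prior-mean test. This is a sketch in core Lean 4 under stated assumptions: φ is encoded as `num/den`, the prior follows the cross-multiplied mixture formula described earlier, and `mixtureWeight` and `endorsed` are hypothetical names. Bin index = percentage / 5, so 75% is bin 15 and 25% is bin 5.

```lean
/-- Unnormalized Beta(a,b) weight at bin k ∈ {0,…,20}. -/
def betaWeight (a b k : Nat) : Nat := k ^ (a - 1) * (20 - k) ^ (b - 1)

/-- Sum of Beta(a,b) weights across all 21 bins. -/
def betaTotal (a b : Nat) : Nat :=
  (List.range 21).foldl (fun acc k => acc + betaWeight a b k) 0

/-- Cross-multiplied mixture weight, φ = num/den (assumed encoding). -/
def mixtureWeight (num den sa sb na nb k : Nat) : Nat :=
  num * betaWeight sa sb k * betaTotal na nb
    + (den - num) * betaWeight na nb k * betaTotal sa sb

/-- Endorsement test `p > E[k | prior]`, cross-multiplied to stay in ℕ. -/
def endorsed (prior : Nat → Nat) (p : Nat) : Bool :=
  let bins := List.range 21
  let z := bins.foldl (fun acc k => acc + prior k) 0
  let s := bins.foldl (fun acc k => acc + k * prior k) 0
  decide (s < p * z)

#eval endorsed (mixtureWeight 1 2 5 3 1 50) 15   -- "runs" at 75%: true
#eval endorsed (mixtureWeight 1 10 2 6 1 50) 5   -- "climbs mountains" at 25%: true
#eval endorsed (mixtureWeight 4 5 7 2 1 50) 5    -- "drinks coffee" at 25%: false
```

The last two lines share the referent bin and differ only in the prior, which is the habitual asymmetry in miniature.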
Case Study 3: Causal Language #
[TG19] (Case Study 3, Experiments 3A–3B) extend the model to
causal generics ("Herb X makes cheebas sleepy"). Here Prevalence is
reinterpreted as the causal rate — the proportion of cases where the cause
produces the effect.
Experimental design: In Experiment 3A, participants see "previous experimental results" (a table of substances tested on 100 subjects) that follow one of four distributions, manipulated between subjects:
- common: all substances show similar efficacy (unimodal distribution)
- rare: some substances show no efficacy, others show high (bimodal distribution)
- strong: effective substances produce strong effects (avg ~98%)
- weak: effective substances produce weak effects (avg ~20%)
In Experiment 3B, participants see one of two referent causal rates (20% or 70%) and judge whether the causal generalization holds ("Herb C makes cheebas sleepy").
We model the four conditions as different prevalence priors, varying the mixture weight φ (common → high φ, rare → low φ) and the stable Beta parameters (strong → high-mean Beta, weak → low-mean Beta). These are approximations of the empirically elicited priors from Experiment 3A, not exact replications.
The paper reports a model fit of r²(8) = 0.835 on causal endorsement data (Figure 11B).
Prior for common-strong cause: most categories have the mechanism (φ=0.75), and the mechanism is highly effective (Beta(10,1) peaked near 100%).
Equations
- TesslerGoodman2019.commonStrongPrior = TesslerGoodman2019.mixturePrior (3 / 4) 10 1 1 50
Prior for common-weak cause: most categories have the mechanism (φ=0.75), but the mechanism is weakly effective (Beta(2,8) peaked near 20%).
Equations
- TesslerGoodman2019.commonWeakPrior = TesslerGoodman2019.mixturePrior (3 / 4) 2 8 1 50
Prior for rare-strong cause: few categories have the mechanism (φ=0.25), but when present it is highly effective (Beta(10,1)).
Equations
- TesslerGoodman2019.rareStrongPrior = TesslerGoodman2019.mixturePrior (1 / 4) 10 1 1 50
Prior for rare-weak cause: few categories have the mechanism (φ=0.25), and the mechanism is weakly effective (Beta(2,8)).
Equations
- TesslerGoodman2019.rareWeakPrior = TesslerGoodman2019.mixturePrior (1 / 4) 2 8 1 50
Rare-weak cause endorsed at 20% causal rate: low prior expectation makes even 20% informative.
Common-strong cause NOT endorsed at 20% causal rate: high prior expectation (peaked near 100%) makes 20% uninformative.
Rare-weak cause endorsed at 70% causal rate.
Common-strong cause NOT endorsed at 50% causal rate: high prior (Beta(10,1), φ=0.75) puts expected rate near 70%, so 50% is uninformative. Note: the paper tests at 20% and 70%. At 70%, the comparison is borderline (E[k|prior] ≈ 14 ≈ bin(70%)), matching the paper's ~50% endorsement rate at referent prevalence 0.7 for common-strong (Figure 11B).
Rare-strong cause NOT endorsed at 20% causal rate (Figure 11B: ~35% endorsement). Despite fewer competing causes than common-strong, the prior still concentrates enough mass above 20% (via Beta(10,1)) to make 20% uninformative.
Rare-strong cause endorsed at 70% causal rate (Figure 11B: ~90% endorsement).
Common-weak cause endorsed at 70% causal rate (Figure 11B: ~75% endorsement). With Beta(2,8) peaked near 20%, a referent rate of 70% far exceeds the prior expectation.
Causal prior asymmetry (Experiment 3B): at 20% referent rate, only rare-weak is endorsed; the other three conditions are not. This matches the paper's Figure 11B (left panel).
At 70% referent rate, all conditions except common-strong are endorsed (Figure 11B). Common-strong is borderline (~50% endorsement in the paper), matching our model's E[k|prior] ≈ bin(70%).
Cue Validity and Endorsement #
[TG19] (pp. 29-30, Appendix A) show that endorsement in the infinite-rationality limit reduces to a cue validity comparison:
endorsed ⟺ prevalence(f, k_ref) > E_prior[prevalence]
⟺ cue_validity(f, k_ref) > 1
where cue_validity(f, k) = prevalence(f, k) / E[prevalence].
This connects the RSA model to the classical notion from [rosch-mervis-1975]: a feature is diagnostic of a category exactly when the feature is more prevalent in that category than expected across categories — i.e., when cue validity > 1.
In S1gen, the endorsement condition
S1(generic | p_ref) > S1(silent | p_ref) reduces to
p_ref.toNat > E[k | prior] after L0 normalization cancels the common factor.
This is exactly the cue validity condition when the expected bin E[k | prior]
serves as the denominator.
Cue validity: ratio of referent prevalence to expected prevalence under the prior.
Equations
- TesslerGoodman2019.cueValidity referentPrevalence expectedPrior = referentPrevalence / expectedPrior
A generic is endorsed (prevalence exceeds prior expectation) iff cue validity > 1.
Unified Architecture #
All three domains — generics, habituals, and causal language — are the same
S1gen speaker over meaningE, with different prevalence priors. The threshold
semantics, RSA inference, and endorsement mechanism are shared; only the prior varies.
This unification is structural (by construction), not proven post hoc. The integration pipeline is:
- Traditional operator (GEN/HAB) reduces to threshold semantics (Genericity.thresholdGeneric, on Quantification.thresholdGtOn)
- Threshold semantics with uncertain threshold → marginalized meaningE → L0gen
- RSA endorsement (S1gen) decides between generic and silence
- Endorsement ≈ cue validity (endorsed_iff_cue_validity_gt_one)
All three case studies are S1gen of a prior — the prior is the only free
parameter (the speaker engine is shared by construction).