@cite{ramotowska-marty-romoli-santorio-2025} - Counterfactuals and Quantificational Force #
Ramotowska, S., Marty, P., Romoli, J. & Santorio, P. (2025). Counterfactuals and quantificational force: Experimental evidence for selectional semantics. Semantics & Pragmatics 18, Article 6: 1–43.
Finding #
Quantifier STRENGTH, not polarity or QUD, determines graded truth-value judgments for counterfactuals embedded under quantifiers.
This supports the SELECTIONAL theory (Stalnaker + supervaluation) over:
- Homogeneity theory (von Fintel/Križ): predicts QUD × polarity interaction
- Universal theory (Lewis/Kratzer): predicts determinate true/false
Experimental Paradigm #
Two experiments using graded truth-value judgments (0–99 slider from "completely false" to "completely true"). QUD manipulated between subjects: E-QuD (existential: "at least one has a chance to win") vs U-QuD (universal: "all are guaranteed to win").
- Experiment 1 (n=87 after exclusion): Lottery scenarios where "only some of the tickets that have been bought win a prize."
- Experiment 2 (n=94 after exclusion): Card game with 4 players; mixed scenario has 2/4 red cards (win) and 2/4 gray cards (lose). Also tested plural definite sentences alongside counterfactuals.
Test sentences (Experiment 2):
- "All/None/Some/Not all of the players would have won if they had played and finished this round."
Key Results #
- Strong quantifiers (every, no): mean ratings < 15 (Exp 1), < 4 (Exp 2)
- Weak quantifiers (some, not every): mean ratings > 84 (Exp 1), > 82 (Exp 2)
- STRENGTH: β = −77.09, p < .001 (Exp 1); β = −88.7, p < .001 (Exp 2)
- QUD: not significant for counterfactuals (Exp 1: β = −0.09, p = .97; Exp 2: β = −0.6, p = 0.7)
- Plural definites WERE sensitive to QUD (Exp 2: β = −12.6, p = 0.01; raw means E-QuD M=41.0, U-QuD M=22.8), confirming that the QUD manipulation was effective
Equations
- RamotowskaEtAl2025.instReprTheory = { reprPrec := RamotowskaEtAl2025.instReprTheory.repr }
Equations
- RamotowskaEtAl2025.instDecidableEqTheory x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Quantifiers tested in the experiment.
- every : Quantifier
- some : Quantifier
- no : Quantifier
- notEvery : Quantifier
Equations
- RamotowskaEtAl2025.instDecidableEqQuantifier x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Map local quantifiers to canonical Strength (B&C Table II).
Equations
- RamotowskaEtAl2025.Quantifier.every.strength = Theories.Semantics.Quantification.Lexicon.Strength.strong
- RamotowskaEtAl2025.Quantifier.no.strength = Theories.Semantics.Quantification.Lexicon.Strength.strong
- RamotowskaEtAl2025.Quantifier.some.strength = Theories.Semantics.Quantification.Lexicon.Strength.weak
- RamotowskaEtAl2025.Quantifier.notEvery.strength = Theories.Semantics.Quantification.Lexicon.Strength.weak
Quantifier strength classification, derived from canonical Strength.
Quantifier polarity classification.
QUD type manipulated between subjects.
Equations
- RamotowskaEtAl2025.instReprQuDType = { reprPrec := RamotowskaEtAl2025.instReprQuDType.repr }
Equations
- RamotowskaEtAl2025.instDecidableEqQuDType x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Selectional theory predictions (Table 3): QUD-independent. Strong quantifiers → rejected (low ratings), weak quantifiers → accepted (high ratings).
Homogeneity theory predictions (Table 3): QUD-dependent. Positive quantifiers: high under E-QuD, low under U-QuD. Negative quantifiers: low under E-QuD, high under U-QuD. The predicted interaction is between QUD and polarity, not strength.
Equations
- RamotowskaEtAl2025.homogeneityPredictedHigh q RamotowskaEtAl2025.QuDType.existential = true (positive q: every, some)
- RamotowskaEtAl2025.homogeneityPredictedHigh q RamotowskaEtAl2025.QuDType.universal = false (positive q: every, some)
- RamotowskaEtAl2025.homogeneityPredictedHigh q RamotowskaEtAl2025.QuDType.existential = false (negative q: no, notEvery)
- RamotowskaEtAl2025.homogeneityPredictedHigh q RamotowskaEtAl2025.QuDType.universal = true (negative q: no, notEvery)
Experimental datum: mean slider rating (0–99 scale) for a condition. 0 = "completely false", 99 = "completely true".
- quantifier : Quantifier
- qud : QuDType
- meanRating : ℚ
Experiment 2 Results (card game, n=94) #
Experiment 2 (§6) provides per-condition mean slider ratings for target counterfactual (TC) sentences in the mixed scenario, reported in §6.7.3 (p. 6:34). Values verified against raw CSV data (OSF: osf.io/3jywr); paper rounds raw means to 1 decimal place.
Experiment 2: mean slider ratings for counterfactuals in mixed scenarios. Verified against raw CSV data (OSF) and paper §6.7.3.
Strong quantifiers (every/all, none): all means < 4. Weak quantifiers (not all, some): all means > 82.
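The per-condition means can be written as a small table and the two thresholds checked mechanically. The following is a minimal Lean sketch, not the module's actual definition. It assumes the rounded means quoted on this page for the eight conditions, stores them as natural numbers in tenths of a slider point (so 12 stands for 1.2) to avoid depending on ℚ, and uses the hypothetical names Datum, isStrong, and exp2.

```lean
inductive Quantifier where
  | every | some | no | notEvery
  deriving DecidableEq, Repr

inductive QuDType where
  | existential | universal
  deriving DecidableEq, Repr

/-- One condition; `tenths` is the mean rating times 10 (assumption: rounded values). -/
structure Datum where
  quantifier : Quantifier
  qud : QuDType
  tenths : Nat
  deriving Repr

/-- Hypothetical strength classifier: every/no are strong, some/notEvery are weak. -/
def isStrong : Quantifier → Bool
  | .every | .no => true
  | .some | .notEvery => false

/-- Rounded Experiment 2 means for counterfactuals in the mixed scenario. -/
def exp2 : List Datum := [
  ⟨.every, .existential, 12⟩,     ⟨.every, .universal, 15⟩,
  ⟨.no, .existential, 9⟩,         ⟨.no, .universal, 33⟩,
  ⟨.some, .existential, 972⟩,     ⟨.some, .universal, 961⟩,
  ⟨.notEvery, .existential, 861⟩, ⟨.notEvery, .universal, 821⟩]

-- Strong conditions stay below 4.0 and weak conditions stay above 82.0.
example :
    exp2.all (fun d =>
      if isStrong d.quantifier then decide (d.tenths < 40)
      else decide (d.tenths > 820)) = true := by
  decide
```

Working in tenths keeps every comparison decidable on Nat, at the cost of discarding the unrounded values from the raw data.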
Experiment 1 Results (lottery, n=87) #
Experiment 1 (§5) uses lottery scenarios where "only some of the tickets that have been bought win a prize." Mean slider ratings (0–99) for counterfactual sentences in the mixed scenario. No per-quantifier × per-QUD breakdown is reported; the paper reports marginal means by STRENGTH (collapsing across QUD and polarity).
Key results from the mixed-effects model (§5.7.2):
- STRENGTH: β = −77.09, p < .001
- QUD: β = −0.09, p = .97 (not significant)
Experiment 1: marginal means by quantifier strength in mixed scenarios. These are the key data points establishing the strength effect (strong < 15, weak > 84). Values verified from raw CSV data (OSF: osf.io/3jywr).
- isStrong : Bool
- meanRating : ℚ
Equations
- RamotowskaEtAl2025.experiment1Marginals = [{ isStrong := true, meanRating := 1705 / 150 }, { isStrong := false, meanRating := 13086 / 146 }]
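The two fractions are easier to read as decimals: 1705 / 150 ≈ 11.37 and 13086 / 146 ≈ 89.63. The strength thresholds stated for Experiment 1 (strong < 15, weak > 84) can be checked without rational arithmetic by cross-multiplying. This is a small illustrative check on natural numbers, not part of the module:

```lean
-- Strong quantifiers: 1705 / 150 < 15 holds iff 1705 < 15 * 150 (= 2250).
example : 1705 < 15 * 150 := by decide

-- Weak quantifiers: 13086 / 146 > 84 holds iff 13086 > 84 * 146 (= 12264).
example : 13086 > 84 * 146 := by decide
```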
Key empirical observation: Strength, not polarity or QUD, determines truth-value judgments for counterfactuals in mixed scenarios.
In Experiment 2, strong quantifiers (every, no) have uniformly low mean ratings (< 4/99) and weak quantifiers (some, not every) have uniformly high ratings (> 82/99). QUD has no significant effect on counterfactual ratings.
Strength effect: all strong quantifier ratings are below 5/99 and all weak quantifier ratings are above 80/99 in the mixed scenario.
This extreme separation rules out chance variation and confirms that strength is the dominant factor.
QUD has no effect on counterfactuals: within each quantifier, E-QuD and U-QuD ratings are close (differ by < 5 points on 0–99 scale).
This is the key prediction of the selectional theory (QUD-independent) and against the homogeneity theory (which predicts QUD × polarity).
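The within-quantifier comparison can be sketched as follows. The block assumes the rounded Experiment 2 means quoted on this page, stored in tenths of a slider point, and the names exp2Pairs and absDiff are hypothetical helpers rather than declarations from the module.

```lean
/-- (label, E-QuD mean, U-QuD mean), with means in tenths of a slider point. -/
def exp2Pairs : List (String × Nat × Nat) := [
  ("every", 12, 15),
  ("some", 972, 961),
  ("no", 9, 33),
  ("not every", 861, 821)]

/-- Absolute difference on Nat: truncated subtraction makes one summand zero. -/
def absDiff (a b : Nat) : Nat := (a - b) + (b - a)

-- Every quantifier's E-QuD and U-QuD means differ by less than 5.0 points.
example :
    exp2Pairs.all (fun p => decide (absDiff p.2.1 p.2.2 < 50)) = true := by
  decide
```

On these rounded values the largest gap is 4.0 points (not every: 86.1 vs 82.1), so the bound of 5 is close to tight.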
Selectional theory succeeds: predictions match data.
The selectional theory predicts that quantifier strength determines ratings regardless of QUD. This matches the observed pattern: strong quantifiers uniformly rejected, weak uniformly accepted, with no QUD modulation.
Homogeneity theory fails: predicted QUD × polarity interaction absent.
The homogeneity theory predicts that positive quantifiers (every, some) should be rated HIGH under E-QuD but LOW under U-QuD, and vice versa for negative quantifiers. The data show no such interaction:
- "every" is low under BOTH QUDs (~1.2 and ~1.5)
- "some" is high under BOTH QUDs (~97.2 and ~96.1)
The homogeneity theory makes wrong predictions for 4 of 8 conditions.
Under U-QuD, homogeneity predicts:
- every → low (✓ observed: 1.5)
- some → low (✗ observed: 96.1)
- no → high (✗ observed: 3.3)
- not every → high (✓ observed: 82.1)
Under E-QuD, homogeneity predicts:
- every → high (✗ observed: 1.2)
- some → high (✓ observed: 97.2)
- no → low (✓ observed: 0.9)
- not every → low (✗ observed: 86.1)
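The tally above can be reproduced by counting, for each theory, the conditions where the predicted direction disagrees with the observed one. This is a sketch under stated assumptions: it uses the rounded means listed above in tenths of a slider point, treats a mean above the scale midpoint (49.5) as "high", and the names isPositive, selectionalHigh, homogeneityHigh, and mismatches are hypothetical stand-ins, not the module's declarations.

```lean
inductive Quantifier where
  | every | some | no | notEvery
  deriving DecidableEq, Repr

inductive QuDType where
  | existential | universal
  deriving DecidableEq, Repr

def isStrong : Quantifier → Bool
  | .every | .no => true
  | .some | .notEvery => false

def isPositive : Quantifier → Bool
  | .every | .some => true
  | .no | .notEvery => false

/-- Selectional prediction: high iff the quantifier is weak, whatever the QUD. -/
def selectionalHigh (q : Quantifier) (_ : QuDType) : Bool := !isStrong q

/-- Homogeneity prediction: positive is high under E-QuD, negative under U-QuD. -/
def homogeneityHigh (q : Quantifier) : QuDType → Bool
  | .existential => isPositive q
  | .universal => !isPositive q

/-- Rounded Experiment 2 means in tenths of a slider point. -/
def exp2 : List (Quantifier × QuDType × Nat) := [
  (.every, .existential, 12),     (.every, .universal, 15),
  (.some, .existential, 972),     (.some, .universal, 961),
  (.no, .existential, 9),         (.no, .universal, 33),
  (.notEvery, .existential, 861), (.notEvery, .universal, 821)]

/-- Number of conditions where `pred` disagrees with "observed mean > 49.5". -/
def mismatches (pred : Quantifier → QuDType → Bool) : Nat :=
  (exp2.filter (fun p => pred p.1 p.2.1 != decide (p.2.2 > 495))).length

example : mismatches homogeneityHigh = 4 := by decide
example : mismatches selectionalHigh = 0 := by decide
```

Because every observed mean is far from the midpoint, the count does not depend on the exact cutoff chosen for "high".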
Mixed-Effects Model Results (Table 5 of paper) #
Experiment 2 target counterfactual sentences, linear mixed-effects model with POLARITY, STRENGTH, QUD and interactions as predictors:
| Effect | β | p |
|---|---|---|
| INTERCEPT | 46.1 | < .001 |
| STRENGTH | −88.7 | < .001 |
| QUD | −0.6 | 0.7 |
| POLARITY | 5.9 | < .001 |
| QUD:POLARITY | 0.3 | 0.9 |
| STRENGTH:QUD | 3.9 | 0.2 |
| STRENGTH:POLARITY | −13.2 | < .001 |
| STR:POL:QUD | −5.3 | 0.4 |
Key findings:
- STRENGTH is the dominant predictor (β = −88.7)
- QUD has no significant main effect or interactions
- POLARITY has a small effect (β = 5.9): "some" rated slightly higher than "not every" within weak quantifiers
- STRENGTH×POLARITY interaction (β = −13.2): the polarity effect is confined to weak quantifiers
Plural Definites vs Counterfactuals: QUD Sensitivity #
Experiment 2 also tested plural definite sentences alongside counterfactuals. The key finding: plural definites ARE sensitive to QUD (β = −12.6, p = 0.01), while counterfactuals are NOT (β = −0.6, p = 0.7). This dissociation confirms:
- The QUD manipulation was effective (plural definites detect it)
- Counterfactuals' QUD insensitivity is a genuine semantic property
- Both phenomena use the same DIST operator, but differ in architecture: plural homogeneity is LOCAL (gap before quantifier), while selectional counterfactuals are GLOBAL (Bool before quantifier)
Plural definite datum: mean slider rating under each QUD condition in the mixed scenario ("The players won this round").
- qud : QuDType
- meanRating : ℚ
Experiment 2: plural definite mean ratings in mixed scenario. Unlike counterfactuals, these show a significant QUD effect. The values are raw means from the OSF data (E-QuD 41.0, U-QuD 22.8); paper §6.7.2 reports model-estimated marginals (42.2 / 29.6), which differ from the raw means.
Bridge: map study quantifiers to formal selectional predictions.
Each quantifier maps to the corresponding projection operation
from the theory layer (Counterfactual.lean).
Equations
- RamotowskaEtAl2025.Quantifier.every.selectionalResult results = Semantics.Conditionals.Counterfactual.embeddedSelectional Semantics.Conditionals.Counterfactual.QStrength.strong results
- RamotowskaEtAl2025.Quantifier.some.selectionalResult results = Semantics.Conditionals.Counterfactual.embeddedSelectional Semantics.Conditionals.Counterfactual.QStrength.weak results
- RamotowskaEtAl2025.Quantifier.no.selectionalResult results = Semantics.Conditionals.Counterfactual.noSelectional results
- RamotowskaEtAl2025.Quantifier.notEvery.selectionalResult results = Semantics.Conditionals.Counterfactual.notEverySelectional results
Grounding theorem: the study-level prediction (selectionalPredictedHigh)
agrees with the formal selectional semantics for any mixed input.
This connects the theory layer's three-valued projection operations to the study file's simple strength-based classification. The classification is not stipulated — it is derived from the formal theory by construction.
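The shape of the grounding claim can be illustrated on one concrete mixed input. The sketch below is a Boolean simplification and an assumption throughout: the real embeddedSelectional, noSelectional, and notEverySelectional operate on the theory layer's own types, whose signatures are not shown on this page. Here selectionalResult and predictedHigh are hypothetical stand-ins for the bridge and for selectionalPredictedHigh.

```lean
inductive Quantifier where
  | every | some | no | notEvery

/-- Boolean stand-in for the bridge: quantify over per-individual outcomes. -/
def selectionalResult : Quantifier → List Bool → Bool
  | .every, rs => rs.all id
  | .some, rs => rs.any id
  | .no, rs => rs.all (fun b => !b)
  | .notEvery, rs => rs.any (fun b => !b)

/-- Strength-based classification: weak quantifiers are predicted high. -/
def predictedHigh : Quantifier → Bool
  | .every | .no => false
  | .some | .notEvery => true

/-- A mixed input, as in the card game: two winners and two losers. -/
def mixed : List Bool := [true, true, false, false]

-- On this mixed input the two classifications agree for all four quantifiers.
example : ∀ q, selectionalResult q mixed = predictedHigh q := by
  intro q
  cases q <;> rfl
```

The theorem described above is stated for any mixed input. This example only checks one instance and does not replace that proof.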
Why Strength Matters: Local vs Global Aggregation #
The paper's deepest insight (§2.2): whether gaps arise LOCALLY (before
the quantifier) or GLOBALLY (after the quantifier) determines whether
quantifier strength matters. The algebra is Core.Duality.aggregate
applied to two different inputs:
- Homogeneity uses local scope: each individual's counterfactual is .indet (gap). aggregate_replicate_indet proves both ∃ and ∀ aggregation return .indet, so strength is invisible.
- Selectional uses global scope: within each selected world, individual outcomes are Bool. aggregate_map_ofBool_mixed proves mixed Bools yield .true for ∃ and .false for ∀, which is the strength effect.
Homogeneity architecture erases strength: when gaps arise locally,
both strong and weak quantifiers return .indet. The quantifier's
projection type is invisible — it cannot "see past" gaps.
This is why the homogeneity theory predicts no strength effect and must resort to QUD × polarity to distinguish conditions.
Selectional architecture produces strength effect: when the
quantifier sees only Bools (global scope), mixed inputs yield
.true for weak (∃/disjunctive) and .false for strong (∀/conjunctive).
This connects the study's embeddedSelectional through the bridging
theorem to Duality.aggregate_map_ofBool_mixed.
Selectional counterfactuals are always determinate: under global scope, aggregation over Bools never produces a gap.
This explains why selectional semantics yields crisp true/false judgments (no "undefined"), matching the experimental pattern of extreme slider values (< 4 or > 82).
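The contrast between the two architectures can be reproduced with a small three-valued aggregation. This is a self-contained sketch, not Core.Duality itself: it assumes strong Kleene disjunction and conjunction as the aggregation operators, and the names Tri, Force, join, meet, and this aggregate are hypothetical. The constructors tt and ff stand in for the page's .true and .false.

```lean
/-- Three truth values; `indet` is the gap. -/
inductive Tri where
  | tt | ff | indet

/-- Quantificational force: existential (weak) or universal (strong). -/
inductive Force where
  | exist | univ

def Tri.ofBool : Bool → Tri
  | true => Tri.tt
  | false => Tri.ff

/-- Strong Kleene disjunction: true wins, otherwise a gap propagates. -/
def Tri.join : Tri → Tri → Tri
  | .tt, _ => .tt
  | _, .tt => .tt
  | .ff, .ff => .ff
  | _, _ => .indet

/-- Strong Kleene conjunction: false wins, otherwise a gap propagates. -/
def Tri.meet : Tri → Tri → Tri
  | .ff, _ => .ff
  | _, .ff => .ff
  | .tt, .tt => .tt
  | _, _ => .indet

def aggregate : Force → List Tri → Tri
  | .exist, xs => xs.foldr Tri.join Tri.ff
  | .univ, xs => xs.foldr Tri.meet Tri.tt

/-- Local scope: every individual contributes a gap. -/
def gaps : List Tri := [Tri.indet, Tri.indet, Tri.indet, Tri.indet]

/-- Global scope: individuals contribute Bools, here two true and two false. -/
def mixed : List Tri := [true, true, false, false].map Tri.ofBool

-- Local gaps: both forces return the gap, so strength is invisible.
example : aggregate .exist gaps = Tri.indet := rfl
example : aggregate .univ gaps = Tri.indet := rfl

-- Global Bools: the forces come apart, and neither result is a gap.
example : aggregate .exist mixed = Tri.tt := rfl
example : aggregate .univ mixed = Tri.ff := rfl
```

The four examples are concrete instances only. The general statements are the ones attributed above to aggregate_replicate_indet and aggregate_map_ofBool_mixed.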
Why Plural Definites Are QUD-Sensitive But Counterfactuals Are Not #
Experiment 2 tested both counterfactuals and plural definites ("The players won this round") in the same mixed scenarios. The key finding:
- Counterfactuals: QUD has no effect (β = −0.6, p = 0.7)
- Plural definites: QUD has a significant effect (β = −12.6, p = 0.01; E-QuD M=41.0 vs U-QuD M=22.8)
The NonBivalence dichotomy explains this dissociation:
- Plural definites use LOCAL trivalence: each individual's predication is evaluated via supervaluation (pluralTruthValue / dist), producing .indet when some-but-not-all atoms satisfy the predicate. The quantifier sees these gaps. By aggregate_replicate_indet, ALL quantifier types return .indet.
- Counterfactuals use GLOBAL trivalence: within each selected world, individual outcomes are Boolean. The quantifier sees Bools. By aggregate_map_ofBool_ne_indet, the result is always definite.
The consequence: when the semantic layer returns .indet, the only
source of variation in judgments is pragmatic resolution — and pragmatic
resolution is QUD-dependent (@cite{kriz-2016}: sufficientlyTrue and
addressesIssue). When the semantic layer returns a determinate value,
there is no gap for pragmatics to exploit — QUD has nothing to modulate.
This is why the SAME mixed scenario produces QUD-sensitivity for PDs but not for CFs: the scope of trivalence determines whether pragmatics gets a foothold.
Plural definites are LOCAL: in mixed scenarios, every quantifier
returns .indet. Strength, polarity, and QUD are all invisible at
the semantic level — the quantifier cannot see past the gap.
Counterfactuals are GLOBAL: in mixed scenarios, every quantifier returns a determinate value. There is no gap for pragmatics to exploit.
The dissociation: for the same mixed input (n individuals, some
satisfying the predicate, some not), PDs return .indet while CFs
return a definite value. The gap is what makes PDs QUD-sensitive —
pragmatic resolution via sufficientlyTrue (@cite{kriz-2016}) depends
on the QUD partition. CFs have no gap to resolve.
This is a direct corollary of the local/global aggregation
decomposition in Core.Duality: local scope produces gaps that pass
through quantifiers; global scope produces Bools that quantifiers
can distinguish.
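The step from "gap" to "QUD-sensitive" can be made concrete with a toy resolution function. Everything in this sketch is an assumption made for illustration: resolve is a hypothetical stand-in for pragmatic resolution and is not the sufficientlyTrue or addressesIssue of @cite{kriz-2016}. It simply lets the QUD decide how a gap is judged and leaves determinate values alone.

```lean
inductive Tri where
  | tt | ff | indet

inductive QuD where
  | existential | universal

def Tri.ofBool : Bool → Tri
  | true => Tri.tt
  | false => Tri.ff

/-- TOY resolution (assumption): only a gap consults the QUD. -/
def resolve : Tri → QuD → Bool
  | .tt, _ => true
  | .ff, _ => false
  | .indet, .existential => true
  | .indet, .universal => false

-- Counterfactual case: a determinate value is judged the same under any QUD.
theorem determinate_ignores_qud (b : Bool) (q₁ q₂ : QuD) :
    resolve (Tri.ofBool b) q₁ = resolve (Tri.ofBool b) q₂ := by
  cases b <;> cases q₁ <;> cases q₂ <;> rfl

-- Plural definite case: a gap is judged differently under the two QUDs.
example : resolve Tri.indet .existential ≠ resolve Tri.indet .universal := by
  decide
```

The direction chosen for the gap (accepted under E-QuD, rejected under U-QuD) matches the ordering of the plural definite means reported above, but the observed ratings are graded rather than categorical.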
Related Phenomena #
- Local vs Global Aggregation (Core.Duality.aggregate_*): The paper's deepest architectural insight is formalized as two facts about aggregate. homogeneity_erases_strength derives that local gaps make strength invisible; selectional_strength_effect derives that global Bools produce the strength effect.
- Plural Definite Dissociation (above): scope_determines_qud_sensitivity derives the CF/PD dissociation from the NonBivalence dichotomy. Plural definites are LOCAL (gap before quantifier → QUD-sensitive pragmatic resolution); counterfactuals are GLOBAL (Bool before quantifier → no gap to resolve).
- Modal Homogeneity (@cite{agha-jeretic-2022}): Weak necessity modals (should) are to strong necessity (must) what plural definites are to all-sentences. shouldEval produces .indet in mixed domains (local), while mustEval produces ofBool (global). The same NonBivalence dichotomy predicts that embedded should-sentences would be strength-insensitive while embedded must-sentences would show the strength effect.
- Conditional Excluded Middle (CEM): Stalnaker's semantics validates CEM: (A □→ B) ∨ (A □→ ¬B). See Counterfactual.lean for the proof.