Documentation

Linglib.Studies.GroveWhite2025

[GW25a] #

Factivity, presupposition projection, and the role of discrete knowledge in gradient inference judgments. Natural Language Semantics 34:1–45.

Core Contribution #

Grove & White compare two hypotheses about the gradience observed in inference judgments for clause-embedding predicates:

Both hypotheses are recorded as FactivityHypothesis.FDH and FactivityHypothesis.FGH, exposed as the two GradienceSource values (resolved vs unresolved indeterminacy). The paper's distinctive formal content is the τ-parameterised model and the 2 × 2 model space crossing factivity discreteness with world-knowledge discreteness.

The Discrete-Factivity Model #

The discrete-factivity model is a ParamPred over FactivityReading:

The graded truth value of a predicate at a world w then unfolds to τ · 1[BEL∧C] + (1−τ) · 1[BEL] (discreteFactivity_gradedTruth). This is the closed-form reduction of the τ-vertex of the paper's DAG (definition (13), p. 20).

The Four Models #

Crossing factivity (discrete/gradient) × world knowledge (discrete/gradient) yields four model variants. The paper reports that the discrete-factivity × gradient-world variant achieves the best ELPD across all four datasets (Sect. 4.3–4.4). The 2 × 2 is captured by ModelVariant, with the discrete-factivity-vs-wholly-gradient pair sharing world-knowledge treatment (best_worst_share_world_knowledge) so that switching factivity hypothesis is the active variable.

Connection to PDS #

The paper's formal framework is Probabilistic Dynamic Semantics (PDS), developed in [GW25b]. The model's graded truth is the PMF.probOfSet (= toOuterMeasure) of the satisfied-readings event under the Bernoulli prior: graded inference judgments emerge from marginalising over a discrete reading parameter, exactly the PDS pattern in which a bind over a discrete probability node feeds a Boolean predicate.

Connection to [ST25] #

[ST25]'s RSA model uses the same factivePos / nonFactivePos foundation from Semantics.Attitudes.Factivity for know / think. The bridges clauseEmbedding_factive_eq_st_know and clauseEmbedding_nonfactive_eq_st_think make this explicit. The S&T binary treatment is the τ → {0, 1} limiting case of the discrete-factivity model (st_is_limiting_case).

Connection to D&T 2021/2022 #

The empirical anchoring is provided by DegenTonhauser2022's aggregate projection ratings: under the discrete-factivity model with τ_know > τ_think, the model predicts the empirically observed know > think projection ordering (empirical_ordering_consistent_with_tau). The prior-belief modulation finding from [DT21] (replicated in 2b) is the specific empirical regularity the world-knowledge component is fit to.

§1. Clause-embedding semantics #

The two readings of a clause-embedding predicate under the FDH. The factive reading triggers a projective inference (BEL ∧ C); the nonfactive reading does not (BEL).

Instances For
    @[implicit_reducible]
    Equations
    Equations
    • One or more equations did not get rendered due to their size.
    Instances For
      @[implicit_reducible]
      Equations
      • One or more equations did not get rendered due to their size.
      theorem GroveWhite2025.FactivityReading.sum_univ {α : Type u_1} [AddCommMonoid α] (f : FactivityReadingα) :
      x : FactivityReading, f x = f factive + f nonfactive

      Sum over FactivityReading.univ reduces to a two-term sum. Used by factivityPrior.mass_sum_one and discreteFactivity_gradedTruth so the Fintype enumeration doesn't reappear inline at every call site.

      The Boolean denotation of a clause-embedding predicate, parameterized by the resolved reading. The two readings dispatch directly to Semantics.Attitudes.FactivityfactivePos and nonFactivePos — so this study shares its foundations with [ST25]'s know / think denotations.

      Equations
      Instances For

        §2. The τ-parameterised prior #

        noncomputable def GroveWhite2025.Rat01.toNNReal (τ : Core.Order.Rat01) :
        NNReal

        τ.val : ℚ lifted to ℝ≥0 via the canonical -coercion. Lives outside the Rat01 namespace because Rat01 is an abbrev for a Subtype, so dot notation on τ : Rat01 resolves through the underlying Subtype rather than the Rat01 namespace.

        Equations
        Instances For

          The Bernoulli prior over FactivityReading: factive with probability τ.val, nonfactive with probability 1 − τ.val. The τ parameter is bundled as Rat01 (↥(Set.Icc (0:ℚ) 1)), so the [0,1] constraint is intrinsic to the type rather than threaded as side hypotheses. This is the τ-vertex of the discrete-factivity DAG (definition (13), p. 20).

          Built from mathlib's PMF.bernoulli (a PMF Bool) by relabeling true ↦ .factive, false ↦ .nonfactive.

          Equations
          • One or more equations did not get rendered due to their size.
          Instances For

            §3. The discrete-factivity ParamPred #

            The discrete-factivity model packaged as a ParamPred: Boolean semantics dispatched on FactivityReading, with a Bernoulli prior over that reading. The graded truth value gradedTruth is then the ℝ≥0∞-valued probability mass on the satisfied-readings set — PMF.probOfSet of the "predicate satisfied at this world" event.

            Equations
            Instances For

              Closed-form reduction: graded truth = τ · 1[factivePos] + (1−τ) · 1[nonFactivePos]. This is the substantive content of the τ-parameterised model — graded inference values arise from a τ-weighted mixture of two crisp Boolean readings.

              Monotonicity in τ: at a world that satisfies the factive reading but not the nonfactive reading, increasing τ strictly increases graded truth. The hypothesis pattern factivePos w ∧ ¬ nonFactivePos w is impossible in standard Boolean semantics (factive_entails_nonfactive rules it out), so this lemma is vacuously achievable; the substantive case is the contrapositive one supplied by discreteFactivity_gradedTruth plus monotonicity of the Bernoulli mixture.

              §4. The 2 × 2 model space #

              Sources of gradience in inference judgment tasks.

              • resolved : GradienceSource

                Resolved on each occasion but varying across occasions (type-level).

              • unresolved : GradienceSource

                Persists even after fixing the interpretation (token-level).

              Instances For
                @[implicit_reducible]
                Equations
                Equations
                • One or more equations did not get rendered due to their size.
                Instances For
                  @[reducible]

                  The choice between the discrete (FDH) and gradient (FGH) hypotheses is a binary choice of source for the gradient projection observations.

                  Defined as @[reducible] def rather than abbrev so the unfolding is explicit (mathlib convention).

                  Equations
                  Instances For

                    The Fundamental Discreteness Hypothesis (definition (7a), p. 10): factivity is a discrete property of an expression on each occasion of use. Observed gradience arises from resolved indeterminacy.

                    Equations
                    Instances For

                      The Fundamental Gradience Hypothesis (definition (7b), p. 10): there is no property distinguishing factive from non-factive occurrences. Gradient distinctions reflect gradient inference contributions.

                      Equations
                      Instances For

                        The four model variants from Sect. 4.3–4.4, crossing factivity (discrete/gradient) × world knowledge (discrete/gradient). Each model is a completion of one of the two norming models (Sect. 4.2) with a factivity component.

                        • discreteFactivity : ModelVariant

                          Discrete factivity + gradient world knowledge. Best fit. Extends norming-gradient (Sect. 4.2.1).

                        • whollyDiscrete : ModelVariant

                          Discrete factivity + discrete world knowledge. Extends norming-discrete (Sect. 4.2.2).

                        • whollyGradient : ModelVariant

                          Gradient factivity + gradient world knowledge. Worst fit.

                        • discreteWorld : ModelVariant

                          Gradient factivity + discrete world knowledge.

                        Instances For
                          @[implicit_reducible]
                          Equations
                          Equations
                          • One or more equations did not get rendered due to their size.
                          Instances For

                            Two norming-model bases (Sect. 4.2).

                            • gradient : NormingModel

                              Norming-gradient (Sect. 4.2.1): world knowledge as unresolved gradience.

                            • discrete : NormingModel

                              Norming-discrete (Sect. 4.2.2): world knowledge as resolved gradience.

                            Instances For
                              @[implicit_reducible]
                              Equations
                              Equations
                              • One or more equations did not get rendered due to their size.
                              Instances For

                                The best and worst models share their world-knowledge treatment but differ in factivity hypothesis. This isolates the discreteness of factivity as the variable explaining the ELPD spread between the two extremes.

                                §5. Bridge to [ST25] #

                                The factive reading of clauseEmbeddingSem is the same Boolean predicate as [ST25]'s literalMeaning .knowPos. Both unfold to factivePos from Semantics.Attitudes.Factivity, so the equality is true by construction — a grounding theorem in the sense of CLAUDE.md, witnessing that two paper-specific lexical entries share their foundation.

                                The S&T binary model is the τ → {0, 1} limiting case of the discrete-factivity model: know corresponds to τ_know = 1 (always factive) and think corresponds to τ_think = 0 (never factive). The Grove–White model generalises by allowing intermediate τ values for the same predicate across occasions of use.

                                §6. Empirical anchoring (D&T 2021/2022) #

                                Under the discrete-factivity model with τ_know > τ_think, the model predicts a know > think projection ordering. The empirical ordering from [DT22] (Exp 1a, sliding scale) confirms this direction: aggregate ratings for know exceed those for think. norm_num closes the literal comparison since ratings are -valued.

                                The prior-belief modulation finding from [DT21] (replicated in Exp 2b) is the empirical regularity the world-knowledge component of the discrete-factivity model is fit to: a prior-sensitive (monotone) account predicts that higher prior probability of the complement yields stronger projection (DegenTonhauser2021.sensitive_predicts_modulation), which their data confirm for all 20 predicates.