Linglib.Phenomena.Nonliteral.Metaphor.KaoEtAl2014PMF

@cite{kao-etal-2014-metaphor} on mathlib PMF #

@cite{kao-etal-2014-metaphor}

Kao, J. T., Bergen, L., & Goodman, N. D. (2014). "Formalizing the Pragmatics of Metaphor Understanding". Proceedings of the Annual Meeting of the Cognitive Science Society, 36, 719-724.

Model (verified against paper) #

Parameters follow the paper's §"Model Evaluation" (speaker rationality λ = 3; see §9).

Softmax-of-log at the root, not rpow workaround #

The speaker utility U = log Σ L0 returns -∞ when the QUD-projected L0 marginal is 0. Mathlib's Real.log 0 = 0 would silently break this — exp(λ · 0) = 1 would give nonzero weight to impossible utterances.

The substrate fix: PMF.softmax takes score : α → EReal, with EReal.exp(⊥) = 0 correctly handling impossible utterances. The Kao S1 is then PMF.softmax (fun u => (α : EReal) * ENNReal.log (qudProjL0 g u f⃗)) — directly the paper's formula with no rpow workaround. Mathematically equivalent to (qudProjL0)^λ (via ENNReal.rpow_eq_exp_mul_log) when qudProjL0 > 0, but the EReal form makes the boundary case explicit.
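The shape described above can be sketched as follows. This is a hedged sketch, not the file's literal code: the substrate's `PMF.softmax` is taken at face value from the description, and its side conditions (e.g. a witness that some score is not ⊥) are elided.

```lean
-- Sketch: the S1 shape from the text, assuming a substrate
-- `PMF.softmax : (score : α → EReal) → … → PMF α` over the finite
-- utterance type. Side-condition arguments are elided.
noncomputable def s1Sketch (α : ℝ) (g : Goal) (f : Features) : PMF Cat :=
  PMF.softmax fun u => (α : EReal) * ENNReal.log (qudProjL0 g u f)
```

The EReal-valued score makes the `qudProjL0 = 0` case land on ⊥, which softmax maps to weight `exp ⊥ = 0`, instead of `Real.log`'s silent 0.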

For Kao's empirical priors (Experiment 1b), all featurePriorℕ entries are strictly positive (min = 379), so all QUD-marginals are positive and the boundary doesn't bite numerically. The substrate handles the general case correctly anyway — RSA papers with sparser literal-listener support inherit the right behaviour.

Joint posterior over (World × Goal) #

Kao's L1 jointly infers the speaker's category-and-features and marginalises over goal. PMF-natively: posterior of World × Goal projected to the World-marginal — exactly PMF.posterior_fst_apply from Core/Probability/JointPosterior.lean.

This file uses the equivalent kernel-form: World → PMF Cat defined as goalPrior.bind ∘ s1, folding the goal-marginalisation into the kernel. The two formulations agree by associativity of bind.
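Schematically, the kernel form folds the goal-marginalisation into a bind (the helper name here is hypothetical; the file's real definitions carry extra hypotheses):

```lean
-- Kernel form: marginalise the goal inside the kernel. Agreement with the
-- joint-posterior form is bind-associativity (mathlib's `PMF.bind_bind`).
noncomputable def goalMarginalKernel (goalPrior : PMF Goal)
    (speaker : Goal → Features → PMF Cat) (f : Features) : PMF Cat :=
  goalPrior.bind fun g => speaker g f
```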

§0. Domain types #

Categories: whale (metaphor vehicle) and person (literal referent). Categories double as utterance types.
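A plausible rendering of the type (the generated page elides the actual definition; the `deriving` clause is an assumption):

```lean
inductive Cat where
  | whale   -- metaphor vehicle; doubles as the utterance "whale"
  | person  -- literal referent; doubles as the utterance "person"
  deriving DecidableEq
```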

QUDs: which feature is the speaker trying to communicate?

Feature vector: 3 booleans (e.g. {large, graceful, majestic} for whale).
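Plausible shapes for the elided definitions above (hypothetical renderings; the `World` shape is grounded in the `w.1` / `w.2.1` projections used by the theorems below):

```lean
-- QUDs: one goal per feature dimension.
inductive Goal where
  | g₁ | g₂ | g₃  -- "Is he large / graceful / majestic?"
  deriving DecidableEq

abbrev Features := Bool × Bool × Bool  -- (large, graceful, majestic)
abbrev World := Cat × Features         -- w.1 = category, w.2 = features
```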

§1. Empirical priors (Experiment 1b counts ×10000) #

Feature prior P(f⃗ | c) as integer counts (×10000). Both per-category sums equal 10000 by construction.

Every entry of the feature prior is strictly positive; this is the witness that drives every positivity proof in the file.

§2. PMF construction #

Feature prior as a PMF over Features, parametric in category.

Category prior as a PMF over Cat.

Joint world prior P(c, f⃗) = P(c) · P(f⃗ | c).

worldPmf is full-support: every world has positive joint prior mass (since both catPriorℕ and featurePriorℕ are strictly positive across the full domain). Discharged via mem_support_bind_iff + mem_support_map_iff from mathlib.
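A sketch of the joint prior just described, under the assumption that the category prior is a PMF named `catPmf` here (a hypothetical name; only `featurePmf` is named by the page):

```lean
-- Joint prior P(c, f⃗) = P(c) · P(f⃗ | c) as a PMF over World:
-- draw a category, then push the per-category feature prior onto its fibre.
noncomputable def worldPmfSketch : PMF World :=
  catPmf.bind fun c => (featurePmf c).map fun f => (c, f)
```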

§3. Literal listener L0 #

L0(c, f⃗ | u) = P(f⃗ | c) if c = u, else 0. PMF-form: deterministic embedding of featurePmf u into worlds with category u.

Literal listener: PMF over World concentrated on c = u.
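The deterministic embedding described above can be sketched directly (hypothetical name `l0Sketch`):

```lean
-- Literal listener: push `featurePmf u` onto the category-u fibre of World,
-- so L0(c, f⃗ | u) = P(f⃗ | c) when c = u and 0 otherwise.
noncomputable def l0Sketch (u : Cat) : PMF World :=
  (featurePmf u).map fun f => (u, f)
```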

§4. QUD projection #

noncomputable def Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.qudProjL0 (g : Goal) (u : Cat) (f : Features) : ENNReal

The QUD-projected L0 marginal at (g, u, f⃗): the sum of featurePmf u f' over the QUD-equivalence class of f⃗.

Built from the parametric RSA.QUD.proj substrate — the same primitive shared with Kao 2014 hyperbole and Kao 2015 irony.
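In conventional notation, the projection described above is (writing g(·) for the QUD's projection of a feature vector):

```latex
\mathrm{qudProjL0}(g, u, \vec f)
  \;=\; \sum_{\vec f' \,:\, g(\vec f') = g(\vec f)} L_0(\vec f' \mid u)
```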

The QUD-projected L0 marginal at (g, u, f⃗) is bounded below by featurePmf u f⃗ (since f⃗ is in its own equivalence class).

§5. Speaker S1 (softmax of log-utility, Eq. 1-2) #

Score s1Score α g f u = (α : EReal) * ENNReal.log (qudProjL0 g u f). EReal-valued: -∞ when qudProjL0 = 0, finite otherwise. The softmax substrate PMF.softmax correctly handles -∞ (gives 0 mass).

For Kao's data, qudProjL0 > 0 always, so the score is always finite — the substrate's -∞ handling isn't load-bearing here, but it is the right default for the general RSA pattern.

noncomputable def Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.s1Score (α : ℝ) (g : Goal) (f : Features) (u : Cat) : EReal

Speaker score s1Score α g f u = α · log(qudProjL0 g u f) : EReal (α plays the paper's λ).

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.s1Score_ne_top (α : ℝ) (hα : 0 ≤ α) (g : Goal) (f : Features) (u : Cat) : s1Score α g f u ≠ ⊤

The score is never +∞: direct from substrate PMF.coe_mul_log_ne_top (real coefficient × log of finite ENNReal).

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.s1Score_ne_bot (α : ℝ) (hα : 0 ≤ α) (g : Goal) (f : Features) (u : Cat) : s1Score α g f u ≠ ⊥

The score is never −∞: direct from substrate PMF.coe_mul_log_ne_bot (real coefficient × log of nonzero ENNReal).

noncomputable def Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.s1 (α : ℝ) (hα : 0 ≤ α) (g : Goal) (f : Features) : PMF Cat

Speaker S1: softmax of QUD-projected log-utility (Eq. 1-2).
theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.s1_pos (α : ℝ) (hα : 0 ≤ α) (g : Goal) (f : Features) (u : Cat) : 0 < (s1 α hα g f) u

s1 is full-support: every utterance has positive speaker mass. Direct from the substrate softmax_pos_iff_score_ne_bot: the score is finite-below at every utterance because qudProjL0 g u f > 0 (full literal-listener support across the QUD-projection in Kao's model).
§6. Goal-marginalised speaker #

noncomputable def Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.mixedS1 (α : ℝ) (hα : 0 ≤ α) (goalPrior : PMF Goal) (f : Features) : PMF Cat

Goal-marginalised speaker: Σ_g P(g) · S1(· | g, f⃗).

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.mixedS1_pos (α : ℝ) (hα : 0 ≤ α) (goalPrior : PMF Goal) {g : Goal} (hg : goalPrior g ≠ 0) (f : Features) (u : Cat) : 0 < (mixedS1 α hα goalPrior f) u

mixedS1 positivity (the architectural lemma the audit identified). The goal-marginalised speaker assigns positive mass to every utterance, provided some goal has positive prior. Witnessed by the g-th term of the bind-tsum: goalPrior g * s1 ... u > 0 because both factors are positive.
§7. Pragmatic listener L1 #

L1(c, f⃗ | u) ∝ worldPmf(c, f⃗) · mixedS1(u | f⃗).

PMF-form: the posterior of the kernel (c, f⃗) ↦ mixedS1(· | f⃗) against worldPmf.

noncomputable def Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.L1Kernel (α : ℝ) (hα : 0 ≤ α) (goalPrior : PMF Goal) : World → PMF Cat

The kernel for L1: (c, f⃗) ↦ mixedS1(· | f⃗). Independent of c.

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.L1Kernel_marginal_ne_zero (α : ℝ) (hα : 0 ≤ α) (goalPrior : PMF Goal) {g : Goal} (hg : goalPrior g ≠ 0) (u : Cat) : PMF.marginal (L1Kernel α hα goalPrior) worldPmf u ≠ 0

The L1 marginal at u is non-zero — needed for posterior discharge. Direct from the witness (.whale, (true, true, true)): worldPmf is full-support (worldPmf_pos) and mixedS1 is positive at every utterance when some goal has positive prior (mixedS1_pos).

noncomputable def Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.L1 (α : ℝ) (hα : 0 ≤ α) (goalPrior : PMF Goal) {g : Goal} (hg : goalPrior g ≠ 0) (u : Cat) : PMF World

Pragmatic listener L1: posterior over World given utterance u.
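Unfolding the kernel posterior, L1 is the Bayes form the section describes:

```latex
L_1(c, \vec f \mid u)
  \;=\;
  \frac{P(c,\vec f)\,\sum_{g} P(g)\, S_1(u \mid g, \vec f)}
       {\sum_{c',\vec f'} P(c',\vec f')\,\sum_{g} P(g)\, S_1(u \mid g, \vec f')}
```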

§8. Standard goal priors #

Vague QUD: uniform goal prior ("What is he like?").

Specific QUD targeting f₁ ("Is he f₁?"): P(g₁) = 0.6, P(g₂) = P(g₃) = 0.2.
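Sketches of the two priors (names and goal constructors hypothetical; `PMF.uniformOfFintype` and `PMF.ofFintype` are the mathlib constructors one would reach for, the latter requiring a normalisation proof that is elided here):

```lean
-- Vague QUD: uniform over the three goals.
noncomputable def vaguePriorSketch : PMF Goal :=
  PMF.uniformOfFintype Goal

-- Specific QUD targeting f₁: P(g₁) = 0.6, P(g₂) = P(g₃) = 0.2.
noncomputable def specificF₁PriorSketch : PMF Goal :=
  PMF.ofFintype
    (fun g => match g with
      | .g₁ => 3/5
      | .g₂ => 1/5
      | .g₃ => 1/5)
    sorry  -- normalisation ∑ g, · = 1, discharged numerically in practice
```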

§9. Empirical findings (paper §"Model Evaluation") #

Each finding is stated as an outer-measure inequality on the L1 posterior.

Speaker rationality λ = 3 (paper §"Model Evaluation").

L1 with the vague QUD (the most common condition in §"Model Evaluation").

L1 with the specific-f₁ QUD.

§9a. Architectural theorems (parametric in priors) #

The paper's headline isn't "P(person|whale) = 0.994 at Kao's specific empirical priors". It's the architectural claim: rational-speech-act pragmatics, with QUD-projected utility and a kernel that depends only on features (not category), produces L1 posteriors that combine prior knowledge with speaker-side preferences according to Bayes.

Each architectural theorem below states the structural content. The Kao-specific corollaries in §9b instantiate it at the empirical priors.

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.L1_cat_fibre_lt_iff_inner_sum_lt (α : ℝ) (hα : 0 ≤ α) (goalPrior : PMF Goal) {g : Goal} (hg : goalPrior g ≠ 0) (u c₁ c₂ : Cat) : (L1 α hα goalPrior hg u).toOuterMeasure {w : World | w.1 = c₁} < (L1 α hα goalPrior hg u).toOuterMeasure {w : World | w.1 = c₂} ↔ ∑ w : World with w.1 = c₁, worldPmf w * (mixedS1 α hα goalPrior w.2) u < ∑ w : World with w.1 = c₂, worldPmf w * (mixedS1 α hα goalPrior w.2) u

Architectural: posterior-fibre asymmetry follows the unnormalised sums. The structural-decomposition lemma applied to cat-fibres of the L1 posterior: which category L1 favours after observing utterance u reduces to which cat-fibre has more conditional joint mass under worldPmf · mixedS1(u | ·).

This is the architectural form of nonliteral and literal_correct: both findings are instances of "the cat-fibre with more conditional joint mass wins". The numerical content per finding is the inequality of those sums (Kao's specific values vs the model-class generic structure).

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.L1_feature_event_lt_iff_inner_sum_lt (α : ℝ) (hα : 0 ≤ α) (goalPrior : PMF Goal) {g : Goal} (hg : goalPrior g ≠ 0) (u : Cat) (P Q : WorldProp) [DecidablePred P] [DecidablePred Q] : (L1 α hα goalPrior hg u).toOuterMeasure ↑(Finset.filter P Finset.univ) < (L1 α hα goalPrior hg u).toOuterMeasure ↑(Finset.filter Q Finset.univ) ↔ ∑ w : World with P w, worldPmf w * (mixedS1 α hα goalPrior w.2) u < ∑ w : World with Q w, worldPmf w * (mixedS1 α hα goalPrior w.2) u

Architectural: feature-set asymmetry follows the unnormalised sums. Companion of L1_cat_fibre_lt_iff_inner_sum_lt for feature-event sets (used by Findings 2-4 and 5).

The feature_pred predicate carves out a feature-event in World; the outer measure of that event under L1 reduces to summing the conditional joint masses over the feature-event-fibre.

§9b. Kao-specific corollaries (paper findings at empirical priors) #

The 6 findings from @cite{kao-etal-2014-metaphor} §"Model Evaluation" expressed as direct outer-measure inequalities at Kao's empirical priors.

Each finding reduces via §9a's architectural theorems to a comparison of conditional joint sums at Kao's specific values. The remaining numerical discharge is the empirical-fit content — sorried with TODO notes; full formalisation requires either (a) the softmaxWeight_natMul_log_eq_pow bridge (substrate, available) plus per-feature gcongr/norm_num chains over the rpow form, or (b) a mixedS1_lower lemma giving an explicit positive lower bound on the speaker contribution.

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.nonliteral : (vagueL1 Cat.whale).toOuterMeasure {w : World | w.1 = Cat.person} > (vagueL1 Cat.whale).toOuterMeasure {w : World | w.1 = Cat.whale}

Finding 1 (Nonliteral interpretation): hearing "whale", the listener infers person, not literally whale. Paper: P(c_p|u_whale) = 0.994.

By L1_cat_fibre_lt_iff_inner_sum_lt, this reduces to comparing conditional joint sums on the two cat-fibres. The 99× catPrior asymmetry dominates the bounded speaker contribution.

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.feature_large : (vagueL1 Cat.whale).toOuterMeasure {w : World | w.2.1 = true} > (vagueL1 Cat.whale).toOuterMeasure {w : World | w.2.1 = false}

Finding 2 (Feature elevation: large): hearing "whale" raises P(large = T). By L1_feature_event_lt_iff_inner_sum_lt, this reduces to comparing conditional joint sums over the two feature-event-fibres. Whale's featurePrior is concentrated on large = T.

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.feature_graceful : (vagueL1 Cat.whale).toOuterMeasure {w : World | w.2.2.1 = true} > (vagueL1 Cat.whale).toOuterMeasure {w : World | w.2.2.1 = false}

Finding 3 (Feature elevation: graceful).

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.feature_majestic : (vagueL1 Cat.whale).toOuterMeasure {w : World | w.2.2.2 = true} > (vagueL1 Cat.whale).toOuterMeasure {w : World | w.2.2.2 = false}

Finding 4 (Feature elevation: majestic).

theorem Phenomena.Nonliteral.Metaphor.KaoEtAl2014.PMF.context_sensitivity : (specificL1 Cat.whale).toOuterMeasure {w : World | w.2.1 = true} > (vagueL1 Cat.whale).toOuterMeasure {w : World | w.2.1 = true}

Finding 5 (Context sensitivity): the specific QUD raises P(large = T). Cross-config comparison: same outer-measure target, different goalPrior.

Finding 6 (Literal correctness): hearing "person", the listener correctly infers person. Both the prior AND the speaker preferences agree on .person; the cleanest of the 6 findings.

Demonstrates the architectural-theorem usage: convert Set notation to Finset.filter, apply L1_cat_fibre_lt_iff_inner_sum_lt, sorry the remaining inner-sum inequality (the empirical-fit content).
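The discharge pattern just described can be sketched on Finding 1's statement (a hedged sketch: the rewrite may additionally need the `vagueL1` abbreviation unfolded and the Set/Finset conversion made explicit):

```lean
-- Reduce the outer-measure comparison to the inner-sum inequality via the
-- architectural theorem, leaving the numerical content as the final goal.
example : (vagueL1 Cat.whale).toOuterMeasure {w : World | w.1 = Cat.person}
    > (vagueL1 Cat.whale).toOuterMeasure {w : World | w.1 = Cat.whale} := by
  rw [gt_iff_lt, L1_cat_fibre_lt_iff_inner_sum_lt]
  sorry  -- inner-sum inequality: the empirical-fit content
```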

§10. Cross-paper engagement #

@cite{frank-goodman-2012} is the architectural ancestor — basic L0/S1/L1 without QUD inference. Kao's contribution is the joint inference over goals, opening the door to nonliteral interpretations.

@cite{goodman-stuhlmuller-2013} (Phenomena/ScalarImplicatures/Studies/GoodmanStuhlmuller2013PMF.lean) shares the hypergeometric-kernel architecture with Kao's L0 (both use "P(features|category) if categories match"). The architectural difference: G&S2013 is one-step Bayesian (no goal inference); Kao's L1 is two-stage (joint goal-inference enables metaphorical readings).

@cite{kao-etal-2014-hyperbole} (sister paper, same conference) uses the identical RSA architecture for hyperbole (with quantity rather than category as the literally-false dimension). Migration of the hyperbole file would reuse most of this file's substrate.