Documentation

Linglib.Studies.DongEtAl2026PMF

[DHH+26]: the Value-of-Information clarify-or-commit model #

A PMF-level formalisation of the decision-theoretic Value of Information (VoI) framework for adaptive human–agent communication. An agent holds a belief b : PMF Θ over latent user intents and may either commit to an action now or clarify by asking a question before committing. The paper operationalises the choice through the classical Value of Information: ask a question only when its expected improvement in the downstream decision outweighs the communication cost.

A question is modelled as an answer kernel κ : Θ → PMF Y — the paper's p(y ∣ q, θ), the distribution of answer y were θ the true intent. Its answer marginal p(y ∣ q, b) and the updated belief b_y are the project's PMF.marginal κ b and PMF.posterior κ b y; weightedPosteriorValue_eq identifies the (total) per-answer term used here with p(y) · V(b_y).

Main definitions #

Main statements #

Implementation notes #

Utilities are ℝ≥0∞-valued so the model lives natively on PMF. VoI uses truncated subtraction, but V_le_Vpost makes the gap genuine rather than clipped to 0. Homogeneity needs only s ≠ ∞ (via ENNReal.mul_sub); no finiteness of V/Vpost is assumed, so the core results hold for arbitrary intent, action, and answer types.

The worth-asking region is the strict c < VoI U b κ (equivalently 0 < NetVoI), matching the paper's "commit when max_q NetVoI ≤ 0" rule. The cross-question argmax selection of the policy is out of scope: the results here concern the per-question clarify-or-commit decision.

This PMF/ℝ≥0∞ formulation parallels the -valued expected-information-gain substrate Core.Agent.ExperimentDesign.eig (with value function V U): V_le_Vpost is the PMF analogue of ExperimentDesign.eig_nonneg_of_convex and of TsvilodubEtAl2026.evpi_nonneg.

ClarifyRule is the shared clarify-or-commit decision-rule contract: both this paper and [TMS+26] decide clarification from a net value-of-information signal, this paper through the sharp threshold sharpRule (worthAsking_iff_sharpRule), Tsvilodub et al. through a logistic gate (TsvilodubEtAl2026.softGateRule).

Todo #

Expected utility and the value of acting now #

noncomputable def DongEtAl2026.EU {Θ : Type u_1} {A : Type u_2} (U : ΘAENNReal) (b : PMF Θ) (a : A) :
ENNReal

Expected utility of committing to action a under belief b.

Equations
Instances For
    noncomputable def DongEtAl2026.V {Θ : Type u_1} {A : Type u_2} (U : ΘAENNReal) (b : PMF Θ) :
    ENNReal

    Value of acting now: the best expected utility achievable under b.

    Equations
    Instances For

      The value of a question #

      noncomputable def DongEtAl2026.weightedPosteriorValue {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) (y : Y) :
      ENNReal

      Per-answer contribution to the post-question value, written through the joint b θ · κ θ y so it is total — answers with zero marginal contribute 0 without needing a posterior. Equals p(y) · V(b_y) whenever the answer marginal is non-zero; see weightedPosteriorValue_eq.

      Equations
      Instances For
        noncomputable def DongEtAl2026.Vpost {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) :
        ENNReal

        Expected value after asking question κ (the per-intent answer model p(y ∣ q, θ)): the answer-marginal expectation of the value of acting on the resulting posterior belief.

        Equations
        Instances For
          noncomputable def DongEtAl2026.VoI {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) :
          ENNReal

          Value of information of question κ: how much the best achievable expected utility improves, in expectation, by asking before committing.

          Equations
          Instances For
            noncomputable def DongEtAl2026.NetVoI {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (c : ENNReal) (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) :
            ENNReal

            Net value of information: VoI minus the per-question communication cost c.

            Equations
            Instances For
              def DongEtAl2026.worthAsking {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (c : ENNReal) (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) :

              The clarify-or-commit decision: asking κ is worth its cost exactly when its value of information exceeds the communication cost (equivalently 0 < NetVoI; see netVoI_pos_iff_worthAsking).

              Equations
              Instances For
                theorem DongEtAl2026.weightedPosteriorValue_eq {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) (y : Y) (h : PMF.marginal κ b y 0) :
                weightedPosteriorValue U b κ y = PMF.marginal κ b y * V U (PMF.posterior κ b y h)

                The per-answer term equals the answer probability times the value of acting on the posterior — the bridge to the project's PMF.posterior substrate (b_y) and PMF.marginal (p(y)).

                Information never has negative value #

                theorem DongEtAl2026.V_le_Vpost {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) :
                V U b Vpost U b κ

                Value of information is nonnegative: in expectation, the option to update the belief before acting can only help. The decision-theoretic core of the framework.

                theorem DongEtAl2026.V_add_VoI {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) :
                V U b + VoI U b κ = Vpost U b κ

                VoI is the honest gap between acting-now and acting-after-asking, not a value clipped to 0 by truncated subtraction.

                Stakes: positive-homogeneity of the value of information #

                @[simp]
                theorem DongEtAl2026.EU_smul {Θ : Type u_1} {A : Type u_2} (s : ENNReal) (U : ΘAENNReal) (b : PMF Θ) (a : A) :
                EU (s U) b a = s * EU U b a

                Expected utility is homogeneous of degree one in the utility.

                @[simp]
                theorem DongEtAl2026.V_smul {Θ : Type u_1} {A : Type u_2} (s : ENNReal) (U : ΘAENNReal) (b : PMF Θ) :
                V (s U) b = s * V U b

                The value of acting now is homogeneous of degree one in the utility.

                @[simp]
                theorem DongEtAl2026.weightedPosteriorValue_smul {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (s : ENNReal) (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) (y : Y) :
                weightedPosteriorValue (s U) b κ y = s * weightedPosteriorValue U b κ y

                Each per-answer post-question term is homogeneous of degree one in the utility.

                @[simp]
                theorem DongEtAl2026.Vpost_smul {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (s : ENNReal) (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) :
                Vpost (s U) b κ = s * Vpost U b κ

                The post-question value is homogeneous of degree one in the utility.

                theorem DongEtAl2026.VoI_smul {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} {s : ENNReal} (hs : s ) (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) :
                VoI (s U) b κ = s * VoI U b κ

                Scaling every utility by a finite stakes factor s scales the value of information by s: VoI is positive-homogeneous of degree one.

                theorem DongEtAl2026.netVoI_pos_iff_worthAsking {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (c : ENNReal) (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) :
                0 < NetVoI c U b κ worthAsking c U b κ

                The agent's policy is worthAsking exactly when net value of information is positive — the strict side of the paper's NetVoI ≤ 0 commit rule.

                theorem DongEtAl2026.worthAsking_mono_stakes {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} {s s' : ENNReal} (hs : s ) (hs' : s' ) (hss' : s s') (c : ENNReal) (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) (h : worthAsking c (s U) b κ) :
                worthAsking c (s' U) b κ

                Stakes monotonicity: if a question is worth asking at stakes s, it remains worth asking at any higher (finite) stakes s'. Raising the stakes shrinks the region in which the agent commits without clarifying.

                theorem DongEtAl2026.medical_worth_asking_of_animal {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (c : ENNReal) (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) (h : worthAsking c U b κ) :
                worthAsking c (10 U) b κ

                The Mixed-Stakes mechanism, holding belief, question, and cost fixed: a question worth its cost at low (animal, U = 1) stakes is worth its cost at high (medical, U = 10) stakes, because scaling utility scales VoI.

                Uninformative questions carry no value #

                theorem DongEtAl2026.Vpost_const_eq_V {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (U : ΘAENNReal) (b : PMF Θ) (q : PMF Y) :
                (Vpost U b fun (x : Θ) => q) = V U b

                A constant answer kernel fun _ => q (the answer distribution does not depend on the true intent θ) leaves the post-question value equal to the value of acting now: the posterior never moves off the prior. The structural converse of V_le_Vpost.

                theorem DongEtAl2026.VoI_const_eq_zero {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (U : ΘAENNReal) (b : PMF Θ) (q : PMF Y) :
                (VoI U b fun (x : Θ) => q) = 0

                An uninformative question has zero value of information.

                theorem DongEtAl2026.not_worthAsking_const {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} (c : ENNReal) (U : ΘAENNReal) (b : PMF Θ) (q : PMF Y) :
                ¬worthAsking c U b fun (x : Θ) => q

                Negative test: an uninformative question is never worth asking, for any cost c (including c = 0). Truncated subtraction does not manufacture spurious value when the answer is independent of the intent.

                A worked Mixed-Stakes 20 Questions instance #

                The binary core of the 20-Questions task: two candidate targets, a guess for each, and a perfectly-informative yes/no question. This is a positive witness — worthAsking is genuinely satisfiable, not vacuous — and shows the animal → medical transfer concretely.

                def DongEtAl2026.correctnessUtility :
                BoolBoolENNReal

                Correct-guess utility: 1 if the guessed action matches the target, else 0.

                Equations
                Instances For
                  noncomputable def DongEtAl2026.uniformBelief :
                  PMF Bool

                  Uniform prior belief over the two targets.

                  Equations
                  Instances For
                    noncomputable def DongEtAl2026.revealingQuestion :
                    BoolPMF Bool

                    A perfectly-informative question: the answer reveals the target.

                    Equations
                    Instances For

                      A revealing question is worth asking at zero cost: its value of information is strictly positive (Vpost = 1 > 2⁻¹ = V).

                      …and therefore worth asking at the high (medical) stakes too.

                      The decision rule: a sharp threshold #

                      A clarify-or-commit decision rule: clarification propensity as a monotone [0, 1]-valued function of the net value-of-information signal (value minus cost). The shared contract of this paper's sharp threshold and [TMS+26]'s logistic gate (TsvilodubEtAl2026.softGateRule).

                      • propensity :

                        Clarification propensity given the net value signal.

                      • mono : Monotone self.propensity
                      • nonneg (x : ) : 0 self.propensity x
                      • le_one (x : ) : self.propensity x 1
                      Instances For

                        The sharp-threshold rule: clarify exactly when the net value is positive (the paper's worthAsking; see worthAsking_iff_sharpRule).

                        Equations
                        • One or more equations did not get rendered due to their size.
                        Instances For

                          The sharp rule is binary — the formal signature of a threshold process, against which soft gates contrast (TsvilodubEtAl2026.softGateRule_apply_zero).

                          theorem DongEtAl2026.worthAsking_iff_sharpRule {Θ : Type u_1} {A : Type u_2} {Y : Type u_3} {c : ENNReal} (hc : c ) (U : ΘAENNReal) (b : PMF Θ) (κ : ΘPMF Y) (hV : VoI U b κ ) :
                          worthAsking c U b κ sharpRule.propensity ((VoI U b κ).toReal - c.toReal) = 1

                          The account instantiates sharpRule: asking is worth its cost exactly when the sharp-threshold rule fires on the net (real-valued) value signal. The soft-gate rival is TsvilodubEtAl2026.softGateRule.