[DHH+26]: the Value-of-Information clarify-or-commit model #
A PMF-level formalisation of the decision-theoretic Value of Information
(VoI) framework for adaptive human–agent communication. An agent holds a
belief b : PMF Θ over latent user intents and may either commit to an
action now or clarify by asking a question before committing. The paper
operationalises the choice through the classical Value of Information: ask a
question only when its expected improvement in the downstream decision
outweighs the communication cost.
A question is modelled as an answer kernel κ : Θ → PMF Y — the paper's
p(y ∣ q, θ), the distribution of answer y were θ the true intent. Its
answer marginal p(y ∣ q, b) and the updated belief b_y are the project's
PMF.marginal κ b and PMF.posterior κ b y; weightedPosteriorValue_eq
identifies the (total) per-answer term used here with p(y) · V(b_y).
Main definitions #
- EU U b a: expected utility of committing to action a under belief b.
- V U b: value of acting now, ⨆ a, EU U b a.
- Vpost U b κ: expected value after asking κ, ∑' y, p(y) · V(b_y).
- VoI U b κ: value of information, Vpost U b κ - V U b.
- NetVoI c U b κ: VoI net of the per-question cost c.
- worthAsking c U b κ: the clarify-or-commit decision, c < VoI U b κ.
Main statements #
- V_le_Vpost: information never has negative value, V U b ≤ Vpost U b κ.
- V_add_VoI: VoI is the honest increment, V U b + VoI U b κ = Vpost U b κ.
- VoI_smul: VoI is positive-homogeneous in the utility (stakes scale).
- worthAsking_mono_stakes: holding belief, question, and cost fixed, raising
  the stakes (scaling the utility) keeps a question worth asking, so the
  commit-without-asking region shrinks as stakes rise. This is the
  ceteris-paribus mechanism behind the paper's Mixed-Stakes prediction:
  scaling utility scales VoI, so a question clearing a fixed cost at low
  stakes (U = 1, guessing an animal) clears it at high stakes (U = 10,
  diagnosing a disease). medical_worth_asking_of_animal is the named
  instance. The model isolates the utility-scaling mechanism; it does not
  encode the experiments' differing candidate-set sizes or answer models.
Implementation notes #
Utilities are ℝ≥0∞-valued so the model lives natively on PMF. VoI uses
truncated subtraction, but V_le_Vpost makes the gap genuine rather than
clipped to 0. Homogeneity needs only s ≠ ∞ (via ENNReal.mul_sub); no
finiteness of V/Vpost is assumed, so the core results hold for arbitrary
intent, action, and answer types.
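The three arithmetic facts in this note can be checked in isolation on bare ℝ≥0∞ scalars. The following is a minimal sketch, assuming a recent Mathlib; the variables v, vpost and s are illustrative stand-ins for V U b, Vpost U b κ and the stakes factor, not the module's own statements.

```lean
import Mathlib

open scoped ENNReal

-- In general, truncated subtraction on ℝ≥0∞ clips at 0.
example : (1 : ℝ≥0∞) - 2 = 0 :=
  tsub_eq_zero_of_le ENNReal.one_lt_two.le

-- Given v ≤ vpost (the kind of fact V_le_Vpost supplies), nothing is
-- clipped: adding the gap back to v recovers vpost.
example (v vpost : ℝ≥0∞) (h : v ≤ vpost) : v + (vpost - v) = vpost :=
  add_tsub_cancel_of_le h

-- Scaling by a finite factor s distributes over the truncated difference;
-- the only side condition used here is s ≠ ∞.
example (s v vpost : ℝ≥0∞) (hs : s ≠ ∞) :
    s * (vpost - v) = s * vpost - s * v :=
  ENNReal.mul_sub (fun _ _ => hs)
```

The last example is the scalar shape of the homogeneity argument: no finiteness of v or vpost is required, only of the scaling factor.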
The worth-asking region is the strict c < VoI U b κ (equivalently
0 < NetVoI), matching the paper's "commit when max_q NetVoI ≤ 0" rule.
The cross-question argmax selection of the policy is out of scope: the
results here concern the per-question clarify-or-commit decision.
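The equivalence between the strict threshold and positive net value, and its complement (the commit region), are instances of general truncated-subtraction lemmas. A minimal scalar sketch, assuming a recent Mathlib, with voi and c standing in for VoI U b κ and the cost:

```lean
import Mathlib

open scoped ENNReal

-- Clarify region: net value is positive exactly when VoI exceeds the cost.
example (c voi : ℝ≥0∞) : 0 < voi - c ↔ c < voi :=
  tsub_pos_iff_lt

-- Commit region: net value is 0 (it cannot go below 0 in ℝ≥0∞)
-- exactly when VoI does not exceed the cost.
example (c voi : ℝ≥0∞) : voi - c = 0 ↔ voi ≤ c :=
  tsub_eq_zero_iff_le
```

Because subtraction is truncated, "NetVoI ≤ 0" and "NetVoI = 0" coincide in this setting.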
This PMF/ℝ≥0∞ formulation parallels the ℝ-valued expected-information-gain
substrate Core.Agent.ExperimentDesign.eig (with value function V U):
V_le_Vpost is the PMF analogue of ExperimentDesign.eig_nonneg_of_convex
and of TsvilodubEtAl2026.evpi_nonneg.
ClarifyRule is the shared clarify-or-commit decision-rule contract: both
this paper and [TMS+26] decide clarification from a net
value-of-information signal, this paper through the sharp threshold
sharpRule (worthAsking_iff_sharpRule), Tsvilodub et al. through a
logistic gate (TsvilodubEtAl2026.softGateRule).
Todo #
- Discharge the claim that EVPI (TsvilodubEtAl2026.evpi) is the upper bound
  on VoI for any question into a theorem worthAsking c U b κ → c < EVPI.
- Relate VoI / V_le_Vpost to Core.Agent.ExperimentDesign.eig /
  eig_nonneg_of_convex (bridging the ℝ≥0∞-on-PMF and ℝ-on-Fintype carriers)
  so the two statements of "information has nonnegative value" become one
  fact.
Expected utility and the value of acting now #
Expected utility of committing to action a under belief b.
Equations
- DongEtAl2026.EU U b a = ∑' (θ : Θ), b θ * U θ a
Value of acting now: the best expected utility achievable under b.
Equations
- DongEtAl2026.V U b = ⨆ (a : A), DongEtAl2026.EU U b a
The value of a question #
Per-answer contribution to the post-question value, written through the
joint b θ · κ θ y so it is total — answers with zero marginal contribute
0 without needing a posterior. Equals p(y) · V(b_y) whenever the answer
marginal is non-zero; see weightedPosteriorValue_eq.
Equations
- DongEtAl2026.weightedPosteriorValue U b κ y = ⨆ (a : A), ∑' (θ : Θ), b θ * (κ θ) y * U θ a
Expected value after asking question κ (the per-intent answer model
p(y ∣ q, θ)): the answer-marginal expectation of the value of acting on
the resulting posterior belief.
Equations
- DongEtAl2026.Vpost U b κ = ∑' (y : Y), DongEtAl2026.weightedPosteriorValue U b κ y
Value of information of question κ: how much the best achievable
expected utility improves, in expectation, by asking before committing.
Equations
- DongEtAl2026.VoI U b κ = DongEtAl2026.Vpost U b κ - DongEtAl2026.V U b
Net value of information: VoI minus the per-question communication
cost c.
Equations
- DongEtAl2026.NetVoI c U b κ = DongEtAl2026.VoI U b κ - c
The clarify-or-commit decision: asking κ is worth its cost exactly
when its value of information exceeds the communication cost (equivalently
0 < NetVoI; see netVoI_pos_iff_worthAsking).
Equations
- DongEtAl2026.worthAsking c U b κ = (c < DongEtAl2026.VoI U b κ)
The per-answer term equals the answer probability times the value of
acting on the posterior — the bridge to the project's PMF.posterior
substrate (b_y) and PMF.marginal (p(y)).
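The rendered equations above omit type signatures. The sketch below restates them with explicit types in a hypothetical VoISketch namespace, so that it can be compiled on its own against a recent Mathlib. The binder names, implicit arguments, and universe choices are assumptions and may differ from the actual DongEtAl2026 declarations.

```lean
import Mathlib

open scoped ENNReal

namespace VoISketch  -- hypothetical namespace, not the module's own

variable {Θ A Y : Type*}

/-- Expected utility of committing to `a` under belief `b`. -/
noncomputable def EU (U : Θ → A → ℝ≥0∞) (b : PMF Θ) (a : A) : ℝ≥0∞ :=
  ∑' θ, b θ * U θ a

/-- Value of acting now: best expected utility under `b`. -/
noncomputable def V (U : Θ → A → ℝ≥0∞) (b : PMF Θ) : ℝ≥0∞ :=
  ⨆ a, EU U b a

/-- Per-answer term, written through the joint `b θ * κ θ y`. -/
noncomputable def weightedPosteriorValue (U : Θ → A → ℝ≥0∞) (b : PMF Θ)
    (κ : Θ → PMF Y) (y : Y) : ℝ≥0∞ :=
  ⨆ a, ∑' θ, b θ * κ θ y * U θ a

/-- Expected value after asking `κ`. -/
noncomputable def Vpost (U : Θ → A → ℝ≥0∞) (b : PMF Θ) (κ : Θ → PMF Y) : ℝ≥0∞ :=
  ∑' y, weightedPosteriorValue U b κ y

/-- Value of information (truncated subtraction in `ℝ≥0∞`). -/
noncomputable def VoI (U : Θ → A → ℝ≥0∞) (b : PMF Θ) (κ : Θ → PMF Y) : ℝ≥0∞ :=
  Vpost U b κ - V U b

/-- The clarify-or-commit decision at per-question cost `c`. -/
def worthAsking (c : ℝ≥0∞) (U : Θ → A → ℝ≥0∞) (b : PMF Θ) (κ : Θ → PMF Y) : Prop :=
  c < VoI U b κ

-- Sanity check: no single action beats the value of acting now.
example (U : Θ → A → ℝ≥0∞) (b : PMF Θ) (a : A) : EU U b a ≤ V U b := by
  unfold V
  exact le_iSup (fun a' => EU U b a') a

end VoISketch
```

Writing the per-answer term through the joint keeps it defined for every y, including answers of zero marginal probability, without ever constructing a posterior.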
Information never has negative value #
Value of information is nonnegative: in expectation, the option to update the belief before acting can only help. The decision-theoretic core of the framework.
Stakes: positive-homogeneity of the value of information #
Each per-answer post-question term is homogeneous of degree one in the utility.
worthAsking holds exactly when the net value of information is positive:
this is the strict side of the paper's NetVoI ≤ 0 commit rule.
Stakes monotonicity: if a question is worth asking at stakes s, it
remains worth asking at any higher (finite) stakes s'. Raising the stakes
shrinks the region in which the agent commits without clarifying.
The Mixed-Stakes mechanism, holding belief, question, and cost fixed: a
question worth its cost at low (animal, U = 1) stakes is worth its cost at
high (medical, U = 10) stakes, because scaling utility scales VoI.
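At the level of scalars, stakes monotonicity reduces to monotonicity of multiplication in ℝ≥0∞. A minimal sketch, assuming a recent Mathlib and assuming VoI at stakes s has already been rewritten as s * voi (the step that homogeneity provides); the names c, s, s' and voi are illustrative:

```lean
import Mathlib

open scoped ENNReal

-- If the cost is cleared at stakes s, it is cleared at any stakes s' ≥ s.
example (c s s' voi : ℝ≥0∞) (hss : s ≤ s') (h : c < s * voi) :
    c < s' * voi :=
  lt_of_lt_of_le h (by gcongr)
```

Instantiating s with 1 and s' with 10 gives the shape of the animal-to-medical transfer. The finiteness requirement on the stakes enters earlier, when scaling the utility is turned into scaling VoI.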
Uninformative questions carry no value #
A constant answer kernel fun _ => q (the answer distribution does not
depend on the true intent θ) leaves the post-question value equal to the
value of acting now: the posterior never moves off the prior. The structural
converse of V_le_Vpost.
Negative test: an uninformative question is never worth asking, for
any cost c (including c = 0). Truncated subtraction does not manufacture
spurious value when the answer is independent of the intent.
A worked Mixed-Stakes 20 Questions instance #
The binary core of the 20-Questions task: two candidate targets, a guess for
each, and a perfectly-informative yes/no question. This is a positive
witness — worthAsking is genuinely satisfiable, not vacuous — and shows the
animal → medical transfer concretely.
Correct-guess utility: 1 if the guessed action matches the target,
else 0.
Equations
- DongEtAl2026.correctnessUtility θ a = if a = θ then 1 else 0
Uniform prior belief over the two targets.
Equations
- DongEtAl2026.uniformBelief = PMF.uniformOfFintype Bool
A perfectly-informative question: the answer reveals the target.
Equations
- DongEtAl2026.revealingQuestion θ = PMF.pure θ
A revealing question is worth asking at zero cost: its value of
information is strictly positive (Vpost = 1 > 2⁻¹ = V).
…and therefore worth asking at the high (medical) stakes too.
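The numeric comparison behind the worked instance can be checked on its own. The sketch below, assuming a recent Mathlib, verifies only the arithmetic 2⁻¹ < 1 and the resulting positive gap in ℝ≥0∞. It takes the values V = 2⁻¹ and Vpost = 1 stated above as given and does not recompute them from the PMF definitions.

```lean
import Mathlib

open scoped ENNReal

-- The value of acting now (2⁻¹) is strictly below the post-question value (1).
example : (2⁻¹ : ℝ≥0∞) < 1 :=
  ENNReal.inv_lt_one.mpr ENNReal.one_lt_two

-- Hence the truncated difference is strictly positive, i.e. the
-- question clears a cost of 0.
example : (0 : ℝ≥0∞) < 1 - 2⁻¹ :=
  tsub_pos_iff_lt.mpr (ENNReal.inv_lt_one.mpr ENNReal.one_lt_two)
```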
The decision rule: a sharp threshold #
A clarify-or-commit decision rule: clarification propensity as a
monotone [0, 1]-valued function of the net value-of-information signal
(value minus cost). The shared contract of this paper's sharp threshold
and [TMS+26]'s logistic gate
(TsvilodubEtAl2026.softGateRule).
- propensity : ℝ → ℝ
Clarification propensity given the net value signal.
- mono : Monotone self.propensity
The sharp-threshold rule: clarify exactly when the net value is
positive (the paper's worthAsking; see worthAsking_iff_sharpRule).
Equations
- One or more equations did not get rendered due to their size.
The sharp rule is binary — the formal signature of a threshold process,
against which soft gates contrast (TsvilodubEtAl2026.softGateRule_apply_zero).
The account instantiates sharpRule: asking is worth its cost exactly
when the sharp-threshold rule fires on the net (real-valued) value signal.
The soft-gate rival is TsvilodubEtAl2026.softGateRule.
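Since the equations of sharpRule are not rendered on this page, the following is only a plausible reconstruction: a hypothetical two-field ClarifyRule and a 0/1 step at zero, written in a separate RuleSketch namespace and assuming a recent Mathlib. The real structure may carry further fields (for example bounds enforcing the [0, 1] range), and the real sharpRule may be defined differently.

```lean
import Mathlib

namespace RuleSketch  -- hypothetical namespace, not the module's own

/-- Reduced contract: a monotone propensity on the net value signal. -/
structure ClarifyRule where
  propensity : ℝ → ℝ
  mono : Monotone propensity

/-- Assumed sharp threshold: propensity 1 when the net value is positive,
else 0. -/
noncomputable def sharpRule : ClarifyRule where
  propensity := fun x => if 0 < x then 1 else 0
  mono := by
    intro x y hxy
    show (if 0 < x then (1 : ℝ) else 0) ≤ (if 0 < y then (1 : ℝ) else 0)
    -- Four cases; the only nontrivial one (0 < x but not 0 < y)
    -- contradicts x ≤ y.
    split_ifs <;> linarith

end RuleSketch
```

Under this reading, the propensity only ever takes the values 0 and 1, which is the binary behaviour the text contrasts with a logistic gate.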