Core Decision Theory #
@cite{van-rooy-2003} @cite{blackwell-1953}
Theory-neutral decision-theoretic infrastructure: decision problems, expected utility, maximin, and mention-some/mention-all classification.
Promoted from Theories.Semantics.Questions.DecisionTheory so that any module
(RSA, causal decision theory, explanation models) can use decision problems
without pulling in question-semantic types.
Design: Fintype + Finset #
- Functions that sum over the full universe use [Fintype W] with ∑ w : W.
- Functions that operate over action sets or world subsets use Finset.
- questionUtility and expectedVSI take Finset (Finset W) (cells as sets).
- Ordering-sensitive operations (questionMaximin, isMentionSome, resolvingAnswers, …) take a List (Finset W) of cells.
- @cite{van-rooy-2003}. Questioning to Resolve Decision Problems. Linguistics and Philosophy 26.
- @cite{blackwell-1953}. Equivalent Comparisons of Experiments. Annals of Mathematical Statistics 24.
Decision Problems #
A decision problem D = (W, A, U, π) with utility and prior.
- utility : W → A → ℚ
Utility of action a in world w
- prior : W → ℚ
Prior beliefs over worlds (should sum to 1 for proper probability)
Instances For
A uniform prior over a Fintype of worlds
Equations
- Core.DecisionTheory.DecisionProblem.uniformPrior = if Fintype.card W = 0 then fun (x : W) => 0 else fun (x : W) => 1 / ↑(Fintype.card W)
Instances For
Create a decision problem with uniform prior
Equations
- Core.DecisionTheory.DecisionProblem.withUniformPrior utility = { utility := utility, prior := Core.DecisionTheory.DecisionProblem.uniformPrior }
Instances For
Expected Utility #
Expected utility of action a given beliefs.
Equations
- Core.DecisionTheory.expectedUtility dp a = ∑ w : W, dp.prior w * dp.utility w a
Instances For
Value of a decision problem: max EU over actions, or 0 if empty.
Equations
- Core.DecisionTheory.dpValue dp actions = if h : actions.Nonempty then actions.sup' h (Core.DecisionTheory.expectedUtility dp) else 0
Instances For
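As a worked sketch of constructing a problem and evaluating it (all names and numbers below are illustrative; this assumes DecisionProblem is parameterized by world and action types as in the equations above, and that Mathlib's DecidableEq/Fintype deriving handlers are available):

```lean
open Core.DecisionTheory

-- Two worlds, two actions (illustrative).
inductive Weather | sun | rain
  deriving DecidableEq, Fintype

inductive Act | beach | museum
  deriving DecidableEq, Fintype

-- Uniform prior: P(sun) = P(rain) = 1/2.
def dp : DecisionProblem Weather Act :=
  DecisionProblem.withUniformPrior fun w a =>
    match w, a with
    | .sun,  .beach  => 3
    | .rain, .beach  => 0
    | _,     .museum => 1

-- EU(beach) = 1/2·3 + 1/2·0 = 3/2 and EU(museum) = 1/2·1 + 1/2·1 = 1,
-- so the value of the problem over both actions is 3/2.
#eval expectedUtility dp Act.beach
#eval dpValue dp {Act.beach, Act.museum}
```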
Conditional expected utility of action a given cell membership.
Equations
- Core.DecisionTheory.conditionalEU dp cell a = if cell.sum dp.prior = 0 then 0 else ∑ w ∈ cell, dp.prior w / cell.sum dp.prior * dp.utility w a
Instances For
Value of decision problem after learning cell (best EU among actions)
Equations
- Core.DecisionTheory.valueAfterLearning dp actions cell = if h : actions.Nonempty then actions.sup' h (Core.DecisionTheory.conditionalEU dp cell) else 0
Instances For
UV(C) = V(D|C) - V(D), the utility value of learning proposition C.
Equations
- Core.DecisionTheory.utilityValue dp actions cell = Core.DecisionTheory.valueAfterLearning dp actions cell - Core.DecisionTheory.dpValue dp actions
Instances For
Probability of a cell in the partition
Equations
- Core.DecisionTheory.cellProbability dp cell = cell.sum dp.prior
Instances For
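A worked instance with illustrative numbers: two worlds {s, r}, uniform prior, U(s, beach) = 3, U(r, beach) = 0, and U(w, museum) = 1 for all w:

```latex
\begin{align*}
  V(D) &= \max\bigl(\mathrm{EU}(\mathit{beach}), \mathrm{EU}(\mathit{museum})\bigr)
        = \max\bigl(\tfrac{3}{2},\, 1\bigr) = \tfrac{3}{2} \\
  V(D \mid \{s\}) &= \max(3, 1) = 3,
      & \mathrm{UV}(\{s\}) &= 3 - \tfrac{3}{2} = \tfrac{3}{2} \\
  V(D \mid \{r\}) &= \max(0, 1) = 1,
      & \mathrm{UV}(\{r\}) &= 1 - \tfrac{3}{2} = -\tfrac{1}{2}
\end{align*}
```

Note that the utility value of a single answer can be negative (here UV({r}) = −1/2), even though VSI, defined below, is pointwise non-negative.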
Maximin #
S(a) = min_w U(w, a), security level of action a.
Equations
- Core.DecisionTheory.securityLevel dp worlds a = if h : worlds.Nonempty then worlds.inf' h fun (w : W) => dp.utility w a else 0
Instances For
MV = max_a min_w U(w, a), maximin value.
Equations
- Core.DecisionTheory.maximinValue dp worlds actions = if h : actions.Nonempty then actions.sup' h fun (a : A) => Core.DecisionTheory.securityLevel dp worlds a else 0
Instances For
Conditional security level: worst case within cell C
Equations
- Core.DecisionTheory.conditionalSecurityLevel dp worlds a c = Core.DecisionTheory.securityLevel dp (worlds ∩ c) a
Instances For
Maximin value after learning C
Equations
- Core.DecisionTheory.maximinAfterLearning dp worlds actions c = Core.DecisionTheory.maximinValue dp (worlds ∩ c) actions
Instances For
Maximin utility value of learning C
Equations
- Core.DecisionTheory.maximinUtilityValue dp worlds actions c = Core.DecisionTheory.maximinAfterLearning dp worlds actions c - Core.DecisionTheory.maximinValue dp worlds actions
Instances For
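With the same illustrative numbers (uniform prior over {s, r}, U(s, beach) = 3, U(r, beach) = 0, U(w, museum) = 1), the maximin quantities work out as:

```latex
\begin{align*}
  S(\mathit{beach}) &= \min(3, 0) = 0, \quad
  S(\mathit{museum}) = \min(1, 1) = 1, \quad
  \mathrm{MV} = \max(0, 1) = 1 \\
  \mathrm{MV}(\cdot \mid \{s\}) &= \max(3, 1) = 3,
      \quad \mathrm{MUV}(\{s\}) = 3 - 1 = 2 \\
  \mathrm{MV}(\cdot \mid \{r\}) &= \max(0, 1) = 1,
      \quad \mathrm{MUV}(\{r\}) = 1 - 1 = 0
\end{align*}
```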
Resolution #
@cite{van-rooy-2003} p. 736 resolution definition: information c
resolves decision problem (dp, acts) iff some action in acts
weakly dominates every other action on every world in c.
Equations
- Core.DecisionTheory.IsResolved dp acts c = ∃ a ∈ acts, ∀ b ∈ acts, ∀ w ∈ c, dp.utility w a ≥ dp.utility w b
Instances For
Decidability of IsResolved under finite, decidable carriers. This is the consumer-side prerequisite for decide-based evaluation (e.g., List.filter over candidate cells in worked study examples).
Equations
- Core.DecisionTheory.IsResolved.instDecidable dp acts c = id inferInstance
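A minimal sketch of decide-based evaluation (the matchingDP name and the Bool carriers are illustrative, not part of the module):

```lean
open Core.DecisionTheory

-- "Bet on the world": action a pays off iff a matches the world.
def matchingDP : DecisionProblem Bool Bool :=
  DecisionProblem.withUniformPrior fun w a => if a == w then 1 else 0

-- On the singleton cell {true}, betting true weakly dominates betting
-- false, so the problem is resolved; on the full cell neither action
-- dominates the other, so it is not.
#eval decide (IsResolved matchingDP {true, false} {true})
#eval decide (IsResolved matchingDP {true, false} {true, false})
```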
Question Utility #
EUV(Q) = Σ_{q ∈ Q} P(q) · UV(q), expected utility value of question Q.
Equations
- Core.DecisionTheory.questionUtility dp actions cells = ∑ cell ∈ cells, Core.DecisionTheory.cellProbability dp cell * Core.DecisionTheory.utilityValue dp actions cell
Instances For
MV(Q) = min_{q ∈ Q} MV(q), maximin question value.
Equations
- One or more equations did not get rendered due to their size.
- Core.DecisionTheory.questionMaximin dp worlds actions [] = 0
Instances For
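Continuing the illustrative numbers above (uniform prior, UV({s}) = 3/2, UV({r}) = −1/2, MUV({s}) = 2, MUV({r}) = 0), the partition Q = {{s}, {r}} evaluates to:

```latex
\begin{align*}
  \mathrm{EUV}(Q) &= P(\{s\})\,\mathrm{UV}(\{s\}) + P(\{r\})\,\mathrm{UV}(\{r\})
    = \tfrac{1}{2} \cdot \tfrac{3}{2} + \tfrac{1}{2} \cdot \bigl(-\tfrac{1}{2}\bigr)
    = \tfrac{1}{2} \\
  \mathrm{MV}(Q) &= \min\bigl(\mathrm{MUV}(\{s\}),\, \mathrm{MUV}(\{r\})\bigr)
    = \min(2, 0) = 0
\end{align*}
```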
Value of Sample Information #
Optimal action: the action with highest expected utility, or none if the action set is empty.
Noncomputable because it extracts a witness from an existential
(Finset.exists_max_image via Classical.choice).
Equations
- Core.DecisionTheory.optimalAction dp actions = if h : actions.Nonempty then some ⋯.choose else none
Instances For
VSI(C) = V(D|C) - EU(a⁰|C): the value of sample information from learning proposition C, where a⁰ is the current optimal action.
Unlike UV(C) = V(D|C) - V(D), VSI is always non-negative: learning C before choosing can never hurt relative to the current best action applied within C. @cite{van-rooy-2003}, p. 742.
Equations
- One or more equations did not get rendered due to their size.
Instances For
EVSI(Q) = Σ P(C) · VSI(C): the expected value of sample information from asking question Q. @cite{van-rooy-2003}, p. 742.
Equations
- Core.DecisionTheory.expectedVSI dp actions cells = ∑ cell ∈ cells, Core.DecisionTheory.cellProbability dp cell * Core.DecisionTheory.valueSampleInfo dp actions cell
Instances For
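With the same illustrative numbers, the optimal action is a⁰ = beach (EU 3/2 vs. 1), and for the partition Q = {{s}, {r}}:

```latex
\begin{align*}
  \mathrm{VSI}(\{s\}) &= V(D \mid \{s\}) - \mathrm{EU}(\mathit{beach} \mid \{s\})
      = 3 - 3 = 0 \\
  \mathrm{VSI}(\{r\}) &= V(D \mid \{r\}) - \mathrm{EU}(\mathit{beach} \mid \{r\})
      = 1 - 0 = 1 \\
  \mathrm{EVSI}(Q) &= \tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 1 = \tfrac{1}{2}
\end{align*}
```

Both VSI values are non-negative, unlike UV({r}) = −1/2, and EVSI(Q) = 1/2 = EUV(Q), instantiating the identity of the next section.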
EUV = EVSI #
EUV(Q) = EVSI(Q): the expected utility value of a question equals its expected value of sample information.
@cite{van-rooy-2003}, p. 742: "The expected utility value of a question is equal to its expected value of sample information."
The two hypotheses are the law of total expectation (Σ P(C)·EU(a|C) = EU(a)
for all actions) and cell probability normalization (Σ P(C) = 1).
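The identity follows by unfolding both sides (a sketch; a⁰ denotes the optimal action, so EU(a⁰) = V(D)):

```latex
\begin{align*}
  \mathrm{EUV}(Q) &= \sum_C P(C)\bigl(V(D \mid C) - V(D)\bigr)
    = \sum_C P(C)\,V(D \mid C) - V(D)
      && \text{(normalization: } \textstyle\sum_C P(C) = 1\text{)} \\
  \mathrm{EVSI}(Q) &= \sum_C P(C)\bigl(V(D \mid C) - \mathrm{EU}(a^0 \mid C)\bigr)
    = \sum_C P(C)\,V(D \mid C) - \mathrm{EU}(a^0)
      && \text{(total expectation)} \\
  &= \sum_C P(C)\,V(D \mid C) - V(D)
      && \text{(optimality of } a^0\text{)}
\end{align*}
```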
Maximin Monotonicity #
Security level and maximin value are anti-monotone in the world set: restricting to a subset can only improve worst-case guarantees. These lemmas support the Blackwell maximin forward theorem.
Security level is ≤ the utility at any world in the finset.
Security level on a subset ≥ security level on the superset.
min over fewer elements ≥ min over more elements.
Maximin value on a subset ≥ maximin value on the superset.
Since securityLevel increases on subsets (fewer worst cases), maximinValue
(max over actions of security levels) also increases.
Maximin utility value is monotone under cell containment: learning a more specific proposition (subset of worlds) gives higher MUV.
Maximin value of information is non-negative for nonempty cells.
When the cell is nonempty, worlds ∩ c ⊆ worlds, and the maximin over a
subset considers fewer worst cases, so maximinAfterLearning ≥ maximinValue.
The question maximin value is ≤ MUV of each cell in the question.
Special Decision Problems #
An epistemic DP where the agent wants to know the exact world state.
Equations
- Core.DecisionTheory.epistemicDP target = { utility := fun (w : W) (x : A) => if (w == target) = true then 1 else 0, prior := fun (x : W) => 1 }
Instances For
A complete-information DP where only exact-state knowledge is useful.
Equations
- Core.DecisionTheory.completeInformationDP = { utility := fun (w a : W) => if (a == w) = true then 1 else 0, prior := fun (x : W) => 1 }
Instances For
A mention-some DP: any satisfier resolves the problem.
Equations
- Core.DecisionTheory.mentionSomeDP satisfies = { utility := fun (w : W) (a : Bool) => if (a && satisfies w) = true then 1 else 0, prior := fun (x : W) => 1 }
Instances For
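A sketch of instantiating the canonical problems (the Bool and Fin 3 carriers and the names knowRain/someGas are illustrative; note the priors of these problems are the constant 1, i.e. unnormalized weights):

```lean
open Core.DecisionTheory

-- Epistemic problem: the agent wants to know whether w = true;
-- actions are irrelevant, so Unit serves as the action type.
def knowRain : DecisionProblem Bool Unit := epistemicDP true

-- Mention-some: worlds 0 and 1 satisfy; acting (a = true) pays off
-- exactly in satisfier worlds, so any single satisfier answer suffices.
def someGas : DecisionProblem (Fin 3) Bool :=
  mentionSomeDP fun w => decide (w.val < 2)
```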
Binary Question Value Decomposition #
For a binary partition [P, ¬P], the probability-weighted sum of conditional DP values equals Van Rooy's question utility plus the baseline DP value. This is the structural identity connecting "the value of asking a yes/no question" to the decision-theoretic question framework of @cite{van-rooy-2003}.
Cell probabilities of a binary partition [P, ¬P] sum to 1 when the prior is a proper distribution.
Binary question value decomposition: for a binary partition {P, ¬P},
Σ P(cell) · V(D|cell) = EUV({P,¬P}, D) + V(D)
where the LHS is the probability-weighted sum of conditional DP values,
EUV is questionUtility, and V(D) is dpValue.
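The decomposition is immediate from the definition of UV (a sketch, with Q = {P, ¬P} and C ranging over the two cells):

```latex
\begin{align*}
  \sum_{C \in \{P,\, \lnot P\}} P(C)\,V(D \mid C)
    &= \sum_{C} P(C)\bigl(\mathrm{UV}(C) + V(D)\bigr) \\
    &= \mathrm{EUV}(Q) + V(D)\sum_{C} P(C)
     = \mathrm{EUV}(Q) + V(D)
\end{align*}
```

The last step uses the normalization lemma above: the cell probabilities of {P, ¬P} sum to 1 when the prior is a proper distribution.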
Answer and Question Orderings #
@cite{van-rooy-2003}'s relevance-based orderings on answers and questions.
C >_Q D: answer C is strictly more relevant than D given question Q.
Equations
- Core.DecisionTheory.moreRelevantAnswer dp actions c d = decide (Core.DecisionTheory.utilityValue dp actions c > Core.DecisionTheory.utilityValue dp actions d)
Instances For
Q > Q': question Q is strictly better than Q'.
Equations
- Core.DecisionTheory.betterQuestion dp actions q q' = decide (Core.DecisionTheory.questionUtility dp actions q > Core.DecisionTheory.questionUtility dp actions q')