
Linglib.Theories.Pragmatics.InformationTheory.Channel

Communication Channels #

@cite{cover-thomas-2006} @cite{shannon-1948} @cite{zaslavsky-etal-2019}

A CommChannel C W is a finite-alphabet stochastic conditional distribution p(w | c): the basic Shannon channel restricted to finite input/output types. It serves as the shared substrate for the models built on top of it.

The derived quantities (marginalWord, posterior, commPrecision, mutualInfo) live here because they are channel-and-prior functions of purely Shannon character. Capacity-specific theorems live in the sibling ChannelCapacity.lean.

Main definitions #

structure Pragmatics.InformationTheory.CommChannel (C W : Type) [Fintype C] [Fintype W] :

A finite-alphabet communication channel: a row-stochastic conditional encode c w = p(w | c). The Shannon-channel primitive shared by information-theoretic and pragmatic communication models.

Originally NamingChannel in @cite{zaslavsky-etal-2019}; lifted here because the same primitive serves color-naming, lexicalization, asymmetric-lexicon models, and RSA literal-speaker semantics.

  • encode : C → W → ℝ

    p(w|c): probability of word w given meaning c.

  • encode_nonneg (c : C) (w : W) : 0 ≤ self.encode c w
  • encode_sum_one (c : C) : ∑ w : W, self.encode c w = 1
Instances For
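As an illustration (not from the source file), one concrete instance is a binary symmetric channel over `Bool`. The name `bsc`, the crossover probability 0.1, and the proof scripts below are all assumptions of this sketch and may need adjustment against the actual field names:

```lean
import Mathlib
-- Assumes the defining module (Linglib.Theories.Pragmatics.InformationTheory.Channel)
-- is also imported and its namespace opened.
open Pragmatics.InformationTheory

/-- Hypothetical example: a binary symmetric channel,
    p(w|c) = 0.9 when w = c and 0.1 otherwise. -/
noncomputable def bsc : CommChannel Bool Bool where
  encode c w := if c = w then 0.9 else 0.1
  encode_nonneg c w := by by_cases h : c = w <;> simp [h] <;> norm_num
  encode_sum_one c := by cases c <;> norm_num [Fintype.sum_bool]
```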
noncomputable def Pragmatics.InformationTheory.marginalWord {C W : Type} [Fintype C] [Fintype W] (nc : CommChannel C W) (prior : C → ℝ) (w : W) : ℝ

Marginal word probability p(w) = Σ_c p(c) · p(w|c).

Equations
Instances For
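The Equations block is collapsed in this rendering; a definition consistent with the documented formula would be the following sketch (not necessarily the file's exact code):

```lean
noncomputable def marginalWord {C W : Type} [Fintype C] [Fintype W]
    (nc : CommChannel C W) (prior : C → ℝ) (w : W) : ℝ :=
  ∑ c : C, prior c * nc.encode c w
```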
noncomputable def Pragmatics.InformationTheory.posterior {C W : Type} [Fintype C] [Fintype W] (nc : CommChannel C W) (prior : C → ℝ) (w : W) (c : C) : ℝ

Posterior p(c|w) via Bayes' rule. The listener's belief about the meaning after hearing word w (≡ RSA literal listener L₀).

Equations
Instances For
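A Bayes-rule definition matching the docstring would look like this sketch (again, not necessarily the file's exact code):

```lean
noncomputable def posterior {C W : Type} [Fintype C] [Fintype W]
    (nc : CommChannel C W) (prior : C → ℝ) (w : W) (c : C) : ℝ :=
  prior c * nc.encode c w / marginalWord nc prior w
```

Note that in Mathlib's reals division by zero yields 0, so under this sketch the posterior is 0 whenever p(w) = 0; the actual definition may guard this case differently.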
noncomputable def Pragmatics.InformationTheory.commPrecision {C W : Type} [Fintype C] [Fintype W] (nc : CommChannel C W) (prior : C → ℝ) (c : C) : ℝ

Communicative precision (expected surprisal) of meaning c: S(c) = -Σ_w p(w|c) · log p(c|w). Lower means the channel communicates c more precisely. Defined in @cite{zaslavsky-etal-2019}.

Equations
Instances For
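Transcribing the S(c) formula directly gives this plausible definition (a sketch under the assumption that it is expressed via `posterior` and `Real.log`):

```lean
noncomputable def commPrecision {C W : Type} [Fintype C] [Fintype W]
    (nc : CommChannel C W) (prior : C → ℝ) (c : C) : ℝ :=
  -∑ w : W, nc.encode c w * Real.log (posterior nc prior w c)
```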
noncomputable def Pragmatics.InformationTheory.mutualInfo {C W : Type} [Fintype C] [Fintype W] (nc : CommChannel C W) (prior : C → ℝ) : ℝ

Mutual information I(W;C) = Σ_{c,w} p(c) · p(w|c) · log(p(c|w)/p(c)).

Equations
Instances For
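The documented formula translates to the following sketch (not necessarily the file's exact code):

```lean
noncomputable def mutualInfo {C W : Type} [Fintype C] [Fintype W]
    (nc : CommChannel C W) (prior : C → ℝ) : ℝ :=
  ∑ c : C, ∑ w : W, prior c * nc.encode c w * Real.log (posterior nc prior w c / prior c)
```

Expanding the logarithm shows I(W;C) = H(C) − Σ_c p(c) · S(c), so mutual information is high exactly when the commPrecision terms are low on average.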

Basic structural lemmas #

theorem Pragmatics.InformationTheory.CommChannel.encode_le_one {C W : Type} [Fintype C] [Fintype W] (nc : CommChannel C W) (c : C) (w : W) :
nc.encode c w ≤ 1

Each encode probability is at most 1 (from the row-stochastic constraint).

theorem Pragmatics.InformationTheory.marginalWord_nonneg {C W : Type} [Fintype C] [Fintype W] (nc : CommChannel C W) (prior : C → ℝ) (hp : ∀ (c : C), 0 ≤ prior c) (w : W) :
0 ≤ marginalWord nc prior w

Marginal word probability is non-negative under a non-negative prior.

theorem Pragmatics.InformationTheory.marginalWord_sum_one {C W : Type} [Fintype C] [Fintype W] (nc : CommChannel C W) (prior : C → ℝ) (hsum : ∑ c : C, prior c = 1) :
∑ w : W, marginalWord nc prior w = 1

The marginal word distribution sums to 1 under a normalized prior.

theorem Pragmatics.InformationTheory.marginalWord_pos_of {C W : Type} [Fintype C] [Fintype W] (nc : CommChannel C W) (prior : C → ℝ) (hp : ∀ (c : C), 0 ≤ prior c) {c : C} {w : W} (hpc : 0 < prior c) (hew : 0 < nc.encode c w) :
0 < marginalWord nc prior w

When prior c > 0 and encode c w > 0, the marginal p(w) > 0.