PHOIBLE 2.0 schema
@cite{moran-mccloy-2019}
Substrate types for the PHOIBLE 2.0 phonological inventory database (Moran & McCloy 2019, http://phoible.org, DOI 10.5281/zenodo.2626687).
PHOIBLE 2.0 contains 3000+ inventories (doculects) covering 2100+ ISO 639-3
language codes. Each entry is a (languoid × phoneme) pair with distinctive-
feature decomposition. The CSV data/phoible-2019/phoible.csv has 49
columns and ~105K rows.
This Lean substrate mirrors the row-level structure: an Inventory holds a
list of Phonemes, each carrying a FeatureMatrix with one field per PHOIBLE
feature column (38 fields as declared below).
Per-inventory data lives in Data/PHOIBLE/Inventories/{Lang}.lean,
auto-generated by scripts/gen_phoible.py.
Relation to WALS / Maddieson 2013
PHOIBLE 2.0 is the field-canonical successor to UPSID and to WALS Maddieson
chapters 1–19 for inventory facts: ~3000 inventories vs. WALS's ~563
languages, with the original transcribed inventory preserved (rather than
collapsed into Maddieson's 5-bin partition). The two are complementary:
WALS gives partition-based typology; PHOIBLE gives the underlying
inventories. Bridge theorems WALS↔PHOIBLE (e.g. WALS Ch 1 inventory-size
bin matches PHOIBLE inventory cardinality) live in
Phenomena/Phonology/Studies/Maddieson2013.lean (when written).
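The kind of bridge statement described above can be sketched in Lean. This is a hypothetical illustration, not the content of Maddieson2013.lean: the names ConsonantBin and ConsonantBin.ofCount are invented here, and the bin boundaries (6–14, 15–18, 19–25, 26–33, 34+) are assumed to be those of WALS Chapter 1, which bins consonant inventories.

```lean
namespace BridgeSketch

/-- Five size classes for consonant inventories (hypothetical type name). -/
inductive ConsonantBin where
  | small | moderatelySmall | average | moderatelyLarge | large
  deriving DecidableEq, Repr

/-- Map a consonant count to its bin. Boundaries are an assumption of this
sketch; counts below 6 also fall into `small` here. -/
def ConsonantBin.ofCount (n : Nat) : ConsonantBin :=
  if n ≤ 14 then .small
  else if n ≤ 18 then .moderatelySmall
  else if n ≤ 25 then .average
  else if n ≤ 33 then .moderatelyLarge
  else .large

-- A bridge theorem would have this shape, with the count taken from a
-- PHOIBLE inventory rather than a literal.
example : ConsonantBin.ofCount 22 = .average := by decide

end BridgeSketch
```

A real bridge theorem would replace the literal 22 with the consonant count of a specific generated inventory and the right-hand side with the bin recorded for that language in WALS.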
Citation
Moran, Steven & McCloy, Daniel (eds.) 2019. PHOIBLE 2.0. Jena: Max Planck Institute for the Science of Human History. http://phoible.org. DOI: 10.5281/zenodo.2626687
A PHOIBLE distinctive-feature value: present (+), absent (-), or
not applicable (0). PHOIBLE encodes underspecification with 0,
not by omission.
- plus : FeatureValue
- minus : FeatureValue
- zero : FeatureValue
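As a rough sketch, a three-constructor type like the one listed above could be declared and populated from CSV cells as follows. The declaration is reconstructed from the constructors and derived instances shown on this page and may differ from the actual source; the helper FeatureValue.ofCell? is hypothetical and not part of the library.

```lean
namespace ValueSketch

/-- Three-valued feature specification: `+`, `-`, or `0`. -/
inductive FeatureValue where
  | plus | minus | zero
  deriving DecidableEq, BEq, Repr

/-- Hypothetical helper: read one CSV cell. Any cell other than
`+`, `-`, or `0` is rejected with `none`. -/
def FeatureValue.ofCell? : String → Option FeatureValue
  | "+" => some .plus
  | "-" => some .minus
  | "0" => some .zero
  | _   => none

#eval FeatureValue.ofCell? "0"
#eval FeatureValue.ofCell? "?"

end ValueSketch
```

Keeping zero as an explicit constructor, rather than wrapping the value in Option, matches the docstring's point that PHOIBLE records underspecification as a value and not as a missing cell.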
Equations
- Data.PHOIBLE.instDecidableEqFeatureValue x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- Data.PHOIBLE.instBEqFeatureValue.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)
Equations
- Data.PHOIBLE.instReprFeatureValue = { reprPrec := Data.PHOIBLE.instReprFeatureValue.repr }
PHOIBLE segment class (the SegmentClass column).
- consonant : SegmentClass
- vowel : SegmentClass
- tone : SegmentClass
Equations
- Data.PHOIBLE.instDecidableEqSegmentClass x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- Data.PHOIBLE.instBEqSegmentClass.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)
Equations
- Data.PHOIBLE.instReprSegmentClass = { reprPrec := Data.PHOIBLE.instReprSegmentClass.repr }
PHOIBLE source database (the Source column). PHOIBLE aggregates
nine donor databases.
- spa : Source
Stanford Phonology Archive.
- upsid : Source
UCLA Phonological Segment Inventory Database.
- aa : Source
Alphabets of Africa (Hartell 1993, Chanard 2006).
- gm : Source
Christopher Green & Steven Moran's African inventories.
- ph : Source
Inventories compiled for PHOIBLE itself from grammars and phonological descriptions.
- ra : Source
Ramaswami's Indian languages collection (1999).
- saphon : Source
South American Phonological Inventory Database (Michael et al.).
- ea : Source
Eurasia (Nikolaev).
- er : Source
Erich Round's Australian inventories.
Equations
- Data.PHOIBLE.instDecidableEqSource x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- Data.PHOIBLE.instBEqSource = { beq := Data.PHOIBLE.instBEqSource.beq }
Equations
- Data.PHOIBLE.instBEqSource.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)
Equations
- Data.PHOIBLE.instReprSource.repr Data.PHOIBLE.Source.spa prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Data.PHOIBLE.Source.spa")).group prec✝
- Data.PHOIBLE.instReprSource.repr Data.PHOIBLE.Source.upsid prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Data.PHOIBLE.Source.upsid")).group prec✝
- Data.PHOIBLE.instReprSource.repr Data.PHOIBLE.Source.aa prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Data.PHOIBLE.Source.aa")).group prec✝
- Data.PHOIBLE.instReprSource.repr Data.PHOIBLE.Source.gm prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Data.PHOIBLE.Source.gm")).group prec✝
- Data.PHOIBLE.instReprSource.repr Data.PHOIBLE.Source.ph prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Data.PHOIBLE.Source.ph")).group prec✝
- Data.PHOIBLE.instReprSource.repr Data.PHOIBLE.Source.ra prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Data.PHOIBLE.Source.ra")).group prec✝
- Data.PHOIBLE.instReprSource.repr Data.PHOIBLE.Source.saphon prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Data.PHOIBLE.Source.saphon")).group prec✝
- Data.PHOIBLE.instReprSource.repr Data.PHOIBLE.Source.ea prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Data.PHOIBLE.Source.ea")).group prec✝
- Data.PHOIBLE.instReprSource.repr Data.PHOIBLE.Source.er prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Data.PHOIBLE.Source.er")).group prec✝
Equations
- Data.PHOIBLE.instReprSource = { reprPrec := Data.PHOIBLE.instReprSource.repr }
PHOIBLE distinctive-feature matrix with 38 fields, one per feature column
of the CSV. Each value is +/-/0 per the PHOIBLE feature mapping tables.
The feature inventory follows the PHOIBLE-Hayes feature system
(cf. phoible/dev/raw-data/FEATURES/).
- syllabic : FeatureValue
- short_ : FeatureValue
- long_ : FeatureValue
- consonantal : FeatureValue
- sonorant : FeatureValue
- continuant : FeatureValue
- delayedRelease : FeatureValue
- approximant : FeatureValue
- tap : FeatureValue
- trill : FeatureValue
- nasal : FeatureValue
- lateral : FeatureValue
- labial : FeatureValue
- round_ : FeatureValue
- labiodental : FeatureValue
- coronal : FeatureValue
- anterior : FeatureValue
- distributed : FeatureValue
- strident : FeatureValue
- dorsal : FeatureValue
- high : FeatureValue
- low : FeatureValue
- front : FeatureValue
- back : FeatureValue
- tense : FeatureValue
- retractedTongueRoot : FeatureValue
- advancedTongueRoot : FeatureValue
- periodicGlottalSource : FeatureValue
- epilaryngealSource : FeatureValue
- spreadGlottis : FeatureValue
- constrictedGlottis : FeatureValue
- fortis : FeatureValue
- lenis : FeatureValue
- raisedLarynxEjective : FeatureValue
- loweredLarynxImplosive : FeatureValue
- click : FeatureValue
- tone : FeatureValue
The PHOIBLE-flagged tone feature (column "tone").
- stress : FeatureValue
The PHOIBLE-flagged stress feature (column "stress").
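To illustrate how a matrix of this shape can be queried, here is a minimal sketch with a reduced three-field matrix. MiniMatrix and isNasalConsonant are hypothetical names introduced only for this example; the real FeatureMatrix has all the fields listed above, and the default value zero is an assumption of the sketch.

```lean
namespace MatrixSketch

inductive FeatureValue where
  | plus | minus | zero
  deriving DecidableEq, BEq, Repr

/-- Reduced stand-in for the full matrix: three fields, defaulting to `0`. -/
structure MiniMatrix where
  syllabic    : FeatureValue := .zero
  consonantal : FeatureValue := .zero
  nasal       : FeatureValue := .zero
  deriving Repr

/-- A natural-class test: [+consonantal, +nasal]. -/
def MiniMatrix.isNasalConsonant (f : MiniMatrix) : Bool :=
  f.consonantal == FeatureValue.plus && f.nasal == FeatureValue.plus

def m : MiniMatrix := { syllabic := .minus, consonantal := .plus, nasal := .plus }
def a : MiniMatrix := { syllabic := .plus, consonantal := .minus, nasal := .minus }

#eval m.isNasalConsonant  -- true
#eval a.isNasalConsonant  -- false

end MatrixSketch
```

Because every field has the same type, natural-class predicates reduce to Boolean combinations of field comparisons.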
Equations
- Data.PHOIBLE.instReprFeatureMatrix = { reprPrec := Data.PHOIBLE.instReprFeatureMatrix.repr }
A single PHOIBLE phoneme entry: IPA glyph + feature matrix + metadata.
- glyph : String
IPA representation (the Phoneme column).
- glyphId : String
Hex-encoded Unicode codepoint sequence (the GlyphID column).
- allophones : List String
Allophones, taken from the space-separated IPA list in the Allophones column. Empty when NA in source.
- marginal : Bool
Whether the segment is marginal in the inventory (the Marginal column: + → true, NA → false).
- segmentClass : SegmentClass
- features : FeatureMatrix
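The column conventions described in the field docstrings above could be implemented along the following lines. This is a sketch only: parseAllophones and parseMarginal are hypothetical names, and the actual conversion is performed by scripts/gen_phoible.py, whose logic is not shown on this page. The sketch assumes String.splitOn from Lean core.

```lean
namespace ParseSketch

/-- Hypothetical: split the Allophones cell on spaces; `NA` means that no
allophones are recorded, so the result is the empty list. -/
def parseAllophones (cell : String) : List String :=
  if cell == "NA" then [] else cell.splitOn " "

/-- Hypothetical: `+` maps to `true`; any other cell, including `NA`,
maps to `false`. -/
def parseMarginal (cell : String) : Bool :=
  cell == "+"

#eval parseAllophones "NA"
#eval parseAllophones "t tʰ ɾ"
#eval parseMarginal "+"

end ParseSketch
```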
Equations
- Data.PHOIBLE.instReprPhoneme = { reprPrec := Data.PHOIBLE.instReprPhoneme.repr }
A PHOIBLE inventory: the CSV's (inventory × phoneme) rows for one InventoryID, collapsed into a single structure.
- id : Nat
PHOIBLE InventoryID (unique per inventory).
- glottocode : String
Glottolog code.
- iso : String
ISO 639-3 code (may be empty for some doculects).
- languageName : String
- specificDialect : String
Specific dialect, if recorded; empty when NA in source.
- source : Source
- phonemes : List Phoneme
The inventory's full phoneme list.
Equations
- Data.PHOIBLE.instReprInventory = { reprPrec := Data.PHOIBLE.instReprInventory.repr }
Number of phonemes in an inventory.
Number of consonants in an inventory.
Equations
- inv.consonantCount = (List.filter (fun (x : Data.PHOIBLE.Phoneme) => x.segmentClass == Data.PHOIBLE.SegmentClass.consonant) inv.phonemes).length
Number of vowels in an inventory.
Equations
- inv.vowelCount = (List.filter (fun (x : Data.PHOIBLE.Phoneme) => x.segmentClass == Data.PHOIBLE.SegmentClass.vowel) inv.phonemes).length
Number of tones (phonemes whose segment class is tone) in an inventory.
Equations
- inv.toneCount = (List.filter (fun (x : Data.PHOIBLE.Phoneme) => x.segmentClass == Data.PHOIBLE.SegmentClass.tone) inv.phonemes).length
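The three counting functions share one pattern: filter the phoneme list by segment class and take the length. The following self-contained sketch reproduces that pattern on reduced types. Phoneme and Inventory here carry only the fields needed for counting, and the toy inventory is invented for illustration, not drawn from PHOIBLE data.

```lean
namespace CountSketch

inductive SegmentClass where
  | consonant | vowel | tone
  deriving DecidableEq, BEq, Repr

/-- Reduced phoneme: glyph and class only. -/
structure Phoneme where
  glyph : String
  segmentClass : SegmentClass

/-- Reduced inventory: just the phoneme list. -/
structure Inventory where
  phonemes : List Phoneme

/-- Count the phonemes of a given class (hypothetical shared helper). -/
def Inventory.countClass (inv : Inventory) (c : SegmentClass) : Nat :=
  (inv.phonemes.filter (fun p => p.segmentClass == c)).length

def Inventory.consonantCount (inv : Inventory) : Nat := inv.countClass SegmentClass.consonant
def Inventory.vowelCount (inv : Inventory) : Nat := inv.countClass SegmentClass.vowel
def Inventory.toneCount (inv : Inventory) : Nat := inv.countClass SegmentClass.tone

/-- Toy data, not a real PHOIBLE inventory. -/
def toy : Inventory :=
  { phonemes := [⟨"p", .consonant⟩, ⟨"t", .consonant⟩, ⟨"a", .vowel⟩] }

#eval toy.consonantCount  -- 2
#eval toy.vowelCount      -- 1
#eval toy.toneCount       -- 0

end CountSketch
```

Since the filter does not deduplicate, each count is the number of list entries of that class, so a glyph listed twice would be counted twice.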