Linglib.Data.PHOIBLE.Schema

PHOIBLE 2.0 schema

@cite{moran-mccloy-2019}

Substrate types for the PHOIBLE 2.0 phonological inventory database (Moran & McCloy 2019, http://phoible.org, DOI 10.5281/zenodo.2626687).

PHOIBLE 2.0 contains 3000+ inventories (doculects) covering 2100+ ISO 639-3 language codes. Each entry is a (languoid × phoneme) pair with a distinctive-feature decomposition. The CSV data/phoible-2019/phoible.csv has 49 columns and ~105K rows.

This Lean substrate mirrors the row-level structure: an Inventory is a list of Phonemes, each carrying a FeatureMatrix of 37 PHOIBLE features. Per-inventory data lives in Data/PHOIBLE/Inventories/{Lang}.lean, auto-generated by scripts/gen_phoible.py.

Relation to WALS / Maddieson 2013

PHOIBLE 2.0 is the field-canonical successor to UPSID and to WALS Maddieson chapters 1–19 for inventory facts: it has ~3000 inventories against WALS's ~563 languages, and it preserves each inventory as originally transcribed rather than collapsing it into Maddieson's 5-bin partition. The two are complementary: WALS gives partition-based typology; PHOIBLE gives the underlying inventories. Bridge theorems between WALS and PHOIBLE (e.g. that the WALS Ch 1 inventory-size bin matches the PHOIBLE inventory cardinality) are intended to live in Phenomena/Phonology/Studies/Maddieson2013.lean once that file is written.
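As a rough illustration of what such a bridge could compare, the sketch below bins a consonant count into five classes. Everything here is hypothetical: the names `ConsonantBin` and `binOfConsonantCount` do not come from this module, and the thresholds (6–14, 15–18, 19–25, 26–33, 34+) are the ones WALS chapter 1 is usually cited with, so they should be checked against the chapter before being relied on.

```lean
namespace BridgeSketch

/-- Hypothetical five-way partition of consonant-inventory sizes. -/
inductive ConsonantBin where
  | small | moderatelySmall | average | moderatelyLarge | large
  deriving DecidableEq, Repr

/-- Assumed thresholds; verify against WALS chapter 1 before use. -/
def binOfConsonantCount (n : Nat) : ConsonantBin :=
  if n ≤ 14 then .small
  else if n ≤ 18 then .moderatelySmall
  else if n ≤ 25 then .average
  else if n ≤ 33 then .moderatelyLarge
  else .large

-- A bridge statement would assert that the bin recorded by WALS equals
-- the bin computed from the PHOIBLE consonant count of the same language.
#eval decide (binOfConsonantCount 22 = .average)  -- true

end BridgeSketch
```

A real bridge theorem would also have to say which PHOIBLE inventory stands for a WALS language when several doculects share an ISO code; the sketch leaves that open.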

Citation

Moran, Steven & McCloy, Daniel (eds.) 2019. PHOIBLE 2.0. Jena: Max Planck Institute for the Science of Human History. http://phoible.org. DOI: 10.5281/zenodo.2626687

A PHOIBLE distinctive-feature value: present (+), absent (-), or not applicable (0). PHOIBLE encodes underspecification with 0, not by omission.
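A three-valued type of this kind could be sketched as follows. The type name, the constructor names, and the parser `ofString?` are assumptions made for illustration; the rendered page does not show the actual declaration.

```lean
namespace FeatureValueSketch

/-- Hypothetical stand-in for the three-valued PHOIBLE feature value. -/
inductive FeatureValue where
  | plus   -- "+" in the CSV: feature present
  | minus  -- "-" in the CSV: feature absent
  | zero   -- "0" in the CSV: not applicable
  deriving DecidableEq, Repr

/-- Parse one CSV cell; any other string is rejected with `none`. -/
def FeatureValue.ofString? : String → Option FeatureValue
  | "+" => some .plus
  | "-" => some .minus
  | "0" => some .zero
  | _   => none

#eval FeatureValue.ofString? "0"
#eval FeatureValue.ofString? ""

end FeatureValueSketch
```

Because 0 is an explicit constructor, every feature of every segment carries a value, and "not applicable" can never be confused with a missing cell.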

      PHOIBLE segment class (the SegmentClass column).

          PHOIBLE source database (the Source column). PHOIBLE aggregates nine donor databases.

          • spa : Source

            Stanford Phonology Archive.

          • upsid : Source

            UCLA Phonological Segment Inventory Database.

          • aa : Source

            Alphabets of Africa (Hartell 1993, Chanard 2006).

          • gm : Source

            Christopher Green & Steven Moran's African inventories.

          • ph : Source

            Inventories compiled for PHOIBLE itself (Moran 2012).

          • ra : Source

            Ramaswami's Indian languages collection (1999).

          • saphon : Source

            South American Phonological Inventory Database (Michael et al.).

          • ea : Source

            Eurasian phonological inventories (Nikolaev).

          • er : Source

            Erich Round's Australian inventories.
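The constructor list above corresponds to a plain enumeration. The sketch below restates it and adds a hypothetical parser `Source.ofString?`, which is not part of this module; it assumes the CSV's Source column holds the lowercase abbreviations.

```lean
namespace SourceSketch

/-- The nine donor databases, using the constructor names listed above. -/
inductive Source where
  | spa | upsid | aa | gm | ph | ra | saphon | ea | er
  deriving DecidableEq, Repr

/-- Hypothetical helper: map a `Source` cell to its constructor. -/
def Source.ofString? : String → Option Source
  | "spa"    => some .spa
  | "upsid"  => some .upsid
  | "aa"     => some .aa
  | "gm"     => some .gm
  | "ph"     => some .ph
  | "ra"     => some .ra
  | "saphon" => some .saphon
  | "ea"     => some .ea
  | "er"     => some .er
  | _        => none

#eval Source.ofString? "saphon"
#eval decide (Source.ofString? "unknown" = none)  -- true

end SourceSketch
```

Returning `Option` keeps an unrecognised abbreviation visible to the generator instead of silently mapping it to a default source.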

              def Data.PHOIBLE.instReprSource.repr :
              Source → Nat → Std.Format

                PHOIBLE 37-feature distinctive-feature matrix. Each value is +/-/0 per the PHOIBLE feature mapping tables. The feature inventory follows the PHOIBLE-Hayes feature system (cf. phoible/dev/raw-data/FEATURES/).
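One way to picture the matrix is as a structure with one field per feature. The sketch below is deliberately abbreviated to three fields, and both the field names and the helper `isNasalConsonant` are illustrative assumptions rather than the module's actual declarations.

```lean
namespace FeatureMatrixSketch

inductive FeatureValue where
  | plus | minus | zero
  deriving DecidableEq, Repr

/-- Abbreviated stand-in: three fields instead of the full 37. -/
structure FeatureMatrix where
  syllabic    : FeatureValue
  consonantal : FeatureValue
  nasal       : FeatureValue
  deriving DecidableEq, Repr

/-- A natural-class test written directly against the fields. -/
def FeatureMatrix.isNasalConsonant (m : FeatureMatrix) : Bool :=
  m.consonantal == .plus && m.nasal == .plus

def nasalStop : FeatureMatrix :=
  { syllabic := .minus, consonantal := .plus, nasal := .plus }

def oralVowel : FeatureMatrix :=
  { syllabic := .plus, consonantal := .minus, nasal := .minus }

#eval nasalStop.isNasalConsonant        -- true
#eval decide (nasalStop ≠ oralVowel)    -- true

end FeatureMatrixSketch
```

Deriving `DecidableEq` is what lets two matrices be compared field by field, which matches the `decEq` instance documented for `FeatureMatrix`.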

                    def Data.PHOIBLE.instDecidableEqFeatureMatrix.decEq (x✝ x✝¹ : FeatureMatrix) :
                    Decidable (x✝ = x✝¹)

                      A single PHOIBLE phoneme entry: IPA glyph + feature matrix + metadata.

                      • glyph : String

                        IPA representation (the Phoneme column).

                      • glyphId : String

                        Hex-encoded Unicode codepoint sequence (the GlyphID column).

                      • allophones : List String

                        Allophones, parsed from the space-separated IPA list in the Allophones column. Empty when NA in source.

                      • marginal : Bool

                        Whether the segment is marginal in the inventory (the Marginal column: + → true, NA → false).

                      • segmentClass : SegmentClass
                      • features : FeatureMatrix
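The fields above can be read as the following record. The structure mirrors the documented field names and types, while `FeatureMatrix` is cut down to a single field and the two cell parsers are hypothetical helpers that restate the conventions from the field descriptions (NA becomes an empty list or false). The actual conversion may well happen in scripts/gen_phoible.py rather than in Lean.

```lean
namespace PhonemeSketch

inductive FeatureValue where
  | plus | minus | zero
  deriving DecidableEq, Repr

/-- Assumed constructors, following the values of the SegmentClass column. -/
inductive SegmentClass where
  | consonant | vowel | tone
  deriving DecidableEq, Repr

/-- Stub with one field; the documented matrix has 37. -/
structure FeatureMatrix where
  nasal : FeatureValue
  deriving DecidableEq, Repr

structure Phoneme where
  glyph        : String
  glyphId      : String
  allophones   : List String
  marginal     : Bool
  segmentClass : SegmentClass
  features     : FeatureMatrix
  deriving DecidableEq, Repr

/-- Hypothetical: split the Allophones cell on spaces; "NA" gives []. -/
def parseAllophones (cell : String) : List String :=
  if cell == "NA" then []
  else (cell.splitOn " ").filter (fun s => !s.isEmpty)

/-- Hypothetical: "+" means marginal; anything else (including "NA") does not. -/
def parseMarginal (cell : String) : Bool :=
  cell == "+"

/-- Illustrative entry, not copied from any particular inventory. -/
def exampleM : Phoneme :=
  { glyph := "m", glyphId := "006D",
    allophones := parseAllophones "m",
    marginal := parseMarginal "NA",
    segmentClass := .consonant,
    features := { nasal := .plus } }

#eval exampleM.allophones
#eval exampleM.marginal  -- false

end PhonemeSketch
```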
                        def Data.PHOIBLE.instReprPhoneme.repr :
                        Phoneme → Nat → Std.Format
                          def Data.PHOIBLE.instDecidableEqPhoneme.decEq (x✝ x✝¹ : Phoneme) :
                          Decidable (x✝ = x✝¹)

                            A PHOIBLE inventory: the CSV's rows, one per (inventory × phoneme) pair, collapsed into a single structure per inventory.

                            • id : Nat

                              PHOIBLE InventoryID (unique per inventory).

                            • glottocode : String

                              Glottolog code.

                            • iso : String

                              ISO 639-3 code (may be empty for some doculects).

                            • languageName : String
                            • specificDialect : String

                              Specific dialect, if recorded; empty when NA in source.

                            • source : Source
                            • phonemes : List Phoneme

                              The inventory's full phoneme list.
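The collapsing step can be sketched as a fold that groups flat rows by their inventory id. This is only an illustration of the idea: `Row`, `insertRow`, and `collapse` are hypothetical names, the `Inventory` here keeps just three of the documented fields, and the real grouping is presumably done by the generator script rather than in Lean.

```lean
namespace InventorySketch

/-- Hypothetical flat row: only the columns needed to show the grouping. -/
structure Row where
  inventoryId  : Nat
  languageName : String
  glyph        : String
  deriving Repr

/-- Trimmed-down inventory; phonemes are reduced to their glyphs. -/
structure Inventory where
  id           : Nat
  languageName : String
  phonemes     : List String
  deriving Repr

/-- Append the row's glyph to its inventory, creating the inventory if needed. -/
def insertRow (r : Row) : List Inventory → List Inventory
  | [] =>
    [{ id := r.inventoryId, languageName := r.languageName, phonemes := [r.glyph] }]
  | inv :: rest =>
    if inv.id == r.inventoryId then
      { inv with phonemes := inv.phonemes ++ [r.glyph] } :: rest
    else
      inv :: insertRow r rest

/-- Collapse (inventory × phoneme) rows into one structure per inventory. -/
def collapse (rows : List Row) : List Inventory :=
  rows.foldl (fun acc r => insertRow r acc) []

/-- Toy rows, not taken from the PHOIBLE data. -/
def toyRows : List Row :=
  [ { inventoryId := 1, languageName := "LangA", glyph := "p" },
    { inventoryId := 1, languageName := "LangA", glyph := "a" },
    { inventoryId := 2, languageName := "LangB", glyph := "t" } ]

#eval (collapse toyRows).length  -- 2

end InventorySketch
```

The linear scan in `insertRow` is quadratic in the worst case, which is acceptable for a sketch but would be replaced by a hash map keyed on the id for ~105K rows.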

                                def Data.PHOIBLE.instDecidableEqInventory.decEq (x✝ x✝¹ : Inventory) :
                                Decidable (x✝ = x✝¹)

                                  Number of phonemes in an inventory.

                                    Number of consonants in an inventory.

                                      Number of vowels in an inventory.

                                        Number of distinct tones in an inventory.
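The four counts can be sketched as filters over the phoneme list. The function names below are hypothetical, since the rendered page does not show the declared names or their equations, and deduplicating tones by glyph is an assumption about what "distinct" means here.

```lean
namespace CountSketch

inductive SegmentClass where
  | consonant | vowel | tone
  deriving DecidableEq, Repr

/-- Reduced phoneme: glyph and segment class only. -/
structure Phoneme where
  glyph        : String
  segmentClass : SegmentClass
  deriving Repr

structure Inventory where
  phonemes : List Phoneme
  deriving Repr

/-- Phonemes of one segment class. -/
def Inventory.ofClass (inv : Inventory) (c : SegmentClass) : List Phoneme :=
  inv.phonemes.filter (fun p => p.segmentClass == c)

def Inventory.size (inv : Inventory) : Nat :=
  inv.phonemes.length

def Inventory.consonantCount (inv : Inventory) : Nat :=
  (inv.ofClass .consonant).length

def Inventory.vowelCount (inv : Inventory) : Nat :=
  (inv.ofClass .vowel).length

/-- Assumption: "distinct" is read as distinct tone glyphs. -/
def Inventory.toneCount (inv : Inventory) : Nat :=
  ((inv.ofClass .tone).map (·.glyph)).eraseDups.length

/-- Toy inventory, not taken from the PHOIBLE data. -/
def toy : Inventory :=
  { phonemes :=
      [ { glyph := "p", segmentClass := .consonant },
        { glyph := "t", segmentClass := .consonant },
        { glyph := "a", segmentClass := .vowel } ] }

#eval toy.size            -- 3
#eval toy.consonantCount  -- 2
#eval toy.vowelCount      -- 1
#eval toy.toneCount       -- 0

end CountSketch
```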
