Documentation

Linglib.Phenomena.Phonology.Studies.Benua1997

Benua 1997 — Misapplication Unification #

@cite{benua-1997}

Benua's central empirical claim, defended across three case studies: overapplication (Sundanese nasal harmony) and underapplication (Tiberian Hebrew spirantization) of phonological processes in morphologically complex words are duals of one mechanism — high-ranked OO-Faithfulness forcing the derivative to mimic the base, even at the cost of marked structures in the derivative's surface form.

This file formalizes the misapplication-unification claim: the same TETRU-shaped ranking schema (M₁ ≫ OO-Ident ≫ M₂ ≫ IO-Faith) derives both kinds of misapplication, depending on whether the relevant markedness constraint M₂ is satisfied or violated by the base output.

What's here #

What's not here #

Architectural integration #

Consumes:

By construction, every claim about misapplication routes through the unifying Corr substrate. There is no separate Sundanese or Hebrew machinery — the cross-linguistic point of @cite{benua-1997} is that one mechanism explains both.

A minimal segmental inventory shared across both case studies, abstract enough to encode the relevant phonological contrasts:

  • Nasal vs. oral consonant (Sundanese trigger; Hebrew sonority)
  • Nasal vs. oral vowel (Sundanese harmony target)
  • Stop vs. spirant (Hebrew spirantization target)

Per Benua's analysis, the segmental representations abstract from the full IPA forms — what matters is the featural contrast at each position, not segment identity.

Instances For
    @[implicit_reducible]
    Equations
    Equations
    • One or more equations did not get rendered due to their size.
    Instances For

      Data #

      Singular: /ɲiar/ → [ɲĩãr] 'seek' Plural: /ɲ-ar-iar/ → [ɲ-ãl-ĩãr] 'seek (pl)'

      The data source is @cite{cohn-1990} on Sundanese nasal phonology; @cite{benua-1997} reanalyzes it in TCT terms.

      Canonical postnasal nasal harmony: vowels are nasalized iff they follow a nasal consonant. In the singular [ɲĩãr], the post-ɲ vowels [ĩ ã] are nasal as expected.

      In the plural [ɲ-ãl-ĩãr], the root vowels [ĩ ã] in derivative positions 3-4 remain nasal even though they are now post-l (an oral consonant) — overapplication. Without OO-Faith, plain canonical harmony would predict [ɲ-ãl-iar] (oral [i a] in the root). Note the infix vowel [ã] is nasal by canonical post-ɲ harmony — it is not the overapplication target.

      Singular base: /ɲiar/ → [ɲĩãr]. Encoded as [nasalC, nasalV, nasalV, oralC].

      Equations
      • One or more equations did not get rendered due to their size.
      Instances For

        Singular surface form (canonical postnasal nasal harmony).

        Equations
        • One or more equations did not get rendered due to their size.
        Instances For

          Plural input: /ɲ-ar-iar/ — root /ɲiar/ with infix /ar/ inserted after the nasal. Six segments: ɲ, a, r (infix), i, a, r (root).

          Equations
          • One or more equations did not get rendered due to their size.
          Instances For

            Overapplied plural surface form: [ɲ-ãl-ĩãr]. Position 1 (infix vowel) is nasal by canonical harmony (post-ɲ). Positions 3-4 (root vowels) are also nasal — this is the overapplication: they are post-l (oral) so canonical harmony would not spread to them, but OO-Faith to the base forces them to remain nasal.

            Equations
            • One or more equations did not get rendered due to their size.
            Instances For

              Counterfactual non-overapplying surface form: [ɲ-ãl-iar]. Canonical harmony stops at the oral consonant l at position 2; positions 3-4 are NOT in post-nasal context so both root vowels are oral. This is the candidate TCT correctly rules out via OO-Ident.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For

                Morphological OO-edge: maps base positions to their morphological correspondents in the derivative, not index-by-index. The infix -ar- is inserted between root-initial ɲ and the rest, so the alignment is:

                base[0] = ɲ  ↔  deriv[0] = ɲ
                base[1] = i  ↔  deriv[3] = i  (deriv positions 1, 2 are infix material)
                base[2] = a  ↔  deriv[4] = a
                base[3] = r  ↔  deriv[5] = r
                

                Standard correspondence-theoretic treatment of infixation: the OO relation respects morphological identity, not linear position.

                Equations
                Instances For

                  The TCT correspondence diagram for the overapplication candidate.

                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For

                    The TCT correspondence diagram for the non-overapplying candidate.

                    Equations
                    • One or more equations did not get rendered due to their size.
                    Instances For

                      The misapplication theorem (Sundanese case): the overapplied candidate strictly beats the canonical-harmony candidate on OO-Ident. Under TETRU ranking (OO-Ident ≫ *NASAL-AFTER-ORAL), overapplied wins.

                      Data #

                      Imperfect base: /yiktol/ → [yiqθol] 'kill (3sg.m.imperf)' — post-vocalic [t] spirantizes to [θ] by canonical rule. Truncated jussive: /yiktl/ → [yiqṭl] (NOT [yiqθl]) — the [t] in the cluster does NOT spirantize, even though now post-vocalic.

                      Without OO-Faith, the truncated form would canonically spirantize the post-vocalic [t]. Underapplication preserves paradigmatic identity to the imperfect base — but at the featural level: what's preserved is the [-continuant] value of the obstruent, not the segment identity per se. Benua argues this requires IDENT-[continuant]-OO, not segment-level OO-Ident.

                      This case study uses Corr.identViolFeature with a continuant projection to compute featural OO-IDENT, the constraint Benua actually appeals to.

                      Imperfect base input: /yiktol/. Position 2 is the relevant obstruent.

                      Equations
                      • One or more equations did not get rendered due to their size.
                      Instances For

                        Imperfect surface form: [yiqθol] — post-vocalic [t] spirantizes to [θ]. Position 2 is spirant (i.e., [+cont]).

                        Equations
                        • One or more equations did not get rendered due to their size.
                        Instances For

                          Truncated jussive input: /yiktl/.

                          Equations
                          • One or more equations did not get rendered due to their size.
                          Instances For

                            Underapplied jussive: [yiqṭl] — the [t] FAILS to spirantize even though now post-vocalic. Position 2 is stop ([-cont]), preserving the underlying featural value. The empirical winner.

                            Equations
                            • One or more equations did not get rendered due to their size.
                            Instances For

                              Canonical-spirantization jussive: [yiqθl] — what the canonical rule would predict. Position 2 is spirant ([+cont]). The candidate TCT correctly rules out — using featural OO-IDENT.

                              Equations
                              • One or more equations did not get rendered due to their size.
                              Instances For

                                Morphological OO-edge for Hebrew: the truncation removes base position 3 (the syllable-nucleus vowel [o]); base positions 0, 1, 2, and 4 map to derivative positions 0, 1, 2, 3 respectively:

                                base[0] = y  ↔  deriv[0] = y
                                base[1] = i  ↔  deriv[1] = i
                                base[2] = θ/t ↔ deriv[2] = θ/t  (the spirantization site)
                                base[4] = l  ↔  deriv[3] = l    (base[3] = o deleted by truncation)
                                
                                Equations
                                Instances For

                                  TCT diagram for the (empirically winning) underapplied jussive.

                                  Equations
                                  • One or more equations did not get rendered due to their size.
                                  Instances For

                                    TCT diagram for the (empirically losing) canonical-spirantization jussive.

                                    Equations
                                    • One or more equations did not get rendered due to their size.
                                    Instances For

                                      Featural OO-IDENT: counts pairs (i, j) ∈ ooEdge where the [continuant] feature value differs between base[i] and derivative[j]. The constraint @cite{benua-1997} actually appeals to (Ch 4).

                                      For the underapplied candidate: base[2] = spirant ([+cont]), deriv[2] = stop ([-cont]) — featural mismatch. Other pairs preserve [continuant].

                                      For the canonical candidate: base[2] = spirant ([+cont]), deriv[2] = spirant ([+cont]) — featural match.

                                      Equations
                                      • One or more equations did not get rendered due to their size.
                                      Instances For

                                        The Hebrew featural-IDENT inversion: the empirically-correct underapplied candidate has more OO-IDENT-[cont] violations than the canonical-spirantization candidate. Under pure OO-IDENT-[cont] ≫ M₂, canonical would win — contradicting the empirical fact.

                                        The resolution (per @cite{benua-1997} Ch 4): the load-bearing constraint is IDENT-[continuant]-IO preserving the input stop, not OO-Ident. The TETRU schema for Hebrew puts IO-Ident-[cont] above OO-Ident, then *MAX-V (deletes [o]) above all the rest. This file formalizes the OO substrate; the full Hebrew tableau requires the additional IO-faith and *MAX-V machinery, deferred.

                                        The unified architectural claim of @cite{benua-1997}: both overapplication (Sundanese, §2) and underapplication (Tiberian Hebrew, §3) are formalized as the same construction — a 3-role TCT correspondence diagram with IDENT-OO (segmental or featural) on the (.base, .derivative) edge. The empirical contrast between the two languages reduces to which markedness constraint plays the M₂ role in the TETRU schema, and whether OO-Ident is segmental or featural.

                                        Under the TCT.TetruSchema substrate (`Theories/Phonology/OptimalityTheory/TCT.lean`):
                                        
                                        - **Sundanese**: M₂ = `*NASAL-AFTER-ORAL`. OO-Ident is segmental (or
                                          restricted to `[nasal]`). Overapplication = the misapplied
                                          candidate (deriv positions 3-4 nasal) strictly beats canonical.
                                        - **Tiberian Hebrew**: M₂ involves spirantization plus *MAX-V plus
                                          featural IO-Ident-[cont]. The OO-Ident comparison alone does not
                                          decide — full TETRU requires the extra constraints documented above.
                                        
                                        The shared structural mechanism: `TetruSchema.misapplication_wins`
                                        (in `TCT.lean`). 
                                        

                                        The polemic of Benua's thesis #

                                        @cite{benua-1997}'s central architectural argument is that parallel TCT subsumes the predictions of stratal/cyclic phonology for misapplication patterns, eliminating the need for cycles. The two architectures differ in how they explain misapplication, but converge on the same surface predictions for the canonical cases.

                                        This section formalizes the architectural comparison on Sundanese: both architectures predict the overapplied surface form [ɲ-ãl-ĩãr] for the plural, but for different structural reasons.

                                        The bridge: on Sundanese, both architectures agree on the surface form. The agreement is documented as an rfl-checkable claim about the architecturally-distinct derivations producing identical phraseOutput.

                                        A two-stratum stratal derivation of the Sundanese plural.

                                        • Stem cycle: input /ɲiar/ harmonized to [ɲĩãr] = Sundanese.baseOutput.
                                        • Word cycle: stem output combined with infix /-ar-/, then Word-stratum harmony produces [ɲ-ãl-ĩãr] = Sundanese.derivOutputOverapplied. Crucially, the Word stratum preserves the nasal vowels carried over from the Stem cycle (high-ranked IDENT-IO at Word level).
                                        • Phrase cycle: identity (no further phonological adjustment).

                                        The full derivation is encoded directly; the actual stratum-by-stratum OT evaluations are deferred (would require stratum-specific constraint rankings + GEN/EVAL machinery instantiated on Sundanese.Seg). The point of this section is the architectural agreement, not the grammar implementation.

                                        Equations
                                        • One or more equations did not get rendered due to their size.
                                        Instances For

                                          The stratal derivation expressed as a Corr StratalRole Seg. Now stratal and TCT analyses share a substrate type — the Corr family. Cross-stratum feeding edges (.sIn↔.sOut, .sOut↔.wOut, .wOut↔.pOut) carry parallel-pair correspondences.

                                          Equations
                                          • One or more equations did not get rendered due to their size.
                                          Instances For

                                            The stratal derivation projected down to TCT roles. The .sOut becomes TCT .base; the .pOut becomes TCT .derivative; the .wOut is folded out (TCT doesn't distinguish a separate word stratum).

                                            Equations
                                            • One or more equations did not get rendered due to their size.
                                            Instances For

                                              Bridge theorem (substrate level): the stratal derivation's phrase output, when projected to a TCT diagram, IS the TCT derivative form. True by rfl via project_preserves_phrase_as_derivative. The two architectures share their notion of "surface form" by construction once they are projected onto the same Corr substrate.

                                              Bridge theorem (Sundanese-specific): the stratal derivation and the directly-built TCT diagram (Sundanese.overappliedDiagram) agree on the surface form. Both produce Sundanese.derivOutputOverapplied — the empirical convergence claim of @cite{benua-1997}.

                                              Bridge theorem (architectural): the stratal stem output IS the TCT base. Benua's identification of "stratum-1 result = OO-base" is true by construction of the project function.

                                              Bridge theorem (substrate level): the OO-correspondence edge in the projected TCT diagram is the parallel-pair correspondence between the stratal stem output and phrase output. The formal content of "the OO-Faith of TCT replaces the cycle-chain of stratal": the OO edge is recovered from the stratal feeding chain by composing (.sOut, .wOut) and (.wOut, .pOut) into the direct (.base, .derivative) correspondence.

                                              What's not yet proved #

                                              This bridge is substrate-level (the stratal and TCT diagrams share the Corr family) and concrete (one paradigm). The full Benua claim — for every stratal analysis there exists a TCT analysis producing the same surface predictions — is a meta-theoretical statement requiring:

                                              1. A model of stratal grammars as functions Input → Output derived from stratum-specific constraint rankings (Stratal.evalStratum chained via chainEval).
                                              2. A model of TCT grammars as TCTGrammar instances (already exists in TCT.lean).
                                              3. A constructive translation stratalToTCT : StratalGrammar → TCTGrammar such that for all inputs, the derived surface forms agree — essentially: "TCT's OO-Faith with stratum-1-output as base = stratum-2's IDENT-IO with stratum-1-output as input."

                                              Step 3 is the load-bearing piece; @cite{benua-1997}'s polemic is that this translation exists (and is empirically supported by Sundanese, Tiberian Hebrew, and English). Formalizing the constructive translation — or finding a counterexample — is the next major scientific step. The substrate is now in place: any candidate translation would map through the Corr Role α type that both architectures now share.