Documentation

Linglib.Studies.Cinque2005

Cinque 2005: Deriving Greenberg's Universal 20 [Cin05] #

[Cin05] derives the cross-linguistic typology of the four DP-internal elements demonstrative, numeral, adjective, noun from a single universal Merge order plus constrained phrasal movement:

Of the 4! = 24 logically possible orders, exactly 14 are attested and all 14 are derivable; the 10 unattested are exactly the underivable ones (each would require a wrong Merge order of the modifiers).

What this file establishes #

This study runs Cinque's derivations on the single MCB SO carrier (Syntax/Minimalist/SyntacticObject/Derivation.lean): each order is an SO.Derivation (fixed-order External Merge to build the base, then SO.Step.im leftward movements), and the surface order is read off by the computable SO.Derivation.surfaceCats (derivation-grounded externalization — MCB σ_L, not Quot.out). The orders are verified by decide. Internal Merge is index-free (SO.Step.im mover, no trace id); the trace it leaves is the bare SO.traceLeaf, and pied-piping movers (which contain an earlier trace) are built planar-first via the computable DSL (SO.ofPlanar/SO.nodeP/SO.leafP/SO.traceP).

Eight of the fourteen attested orders are derived explicitly here (the bare-NP-raising series a–d and the pied-piping cases n, o, t, x), demonstrating the mechanism end-to-end on the substrate. The full enumeration of the legal derivation space proves reachable = exactly the 14 attested (so the 10 unattested are underivable: u20_reachable_iff_attested). The markedness-tracks-frequency result ([Cin05] (7b)) is the final section.

Encoding note #

The substrate Cat enum has .Num, .A, .N but no dedicated demonstrative constructor, so .D encodes the demonstrative slot here. A dedicated Cat.Dem (with its extended-projection F-level and ±V/±N features) is a deferred substrate refinement; it does not affect the combinatorial result, which only needs four distinct categories.

The four elements (head-initial leaves) #

Demonstrative (encoded as .D; see module note).

Equations
Instances For

    Frequency buckets and the 24-order attestation table ([Cin05] (6)) #

    Transcribed from the paper. rows are attested; * rows unattested. Frequency is the paper's cross-linguistic bucket (number of languages instantiating the order). .D stands for the demonstrative.

    Cross-linguistic frequency of an order ([Cin05] (6)).

    Instances For
      @[implicit_reducible]
      instance Cinque2005.instDecidableEqFreq :
      DecidableEq Freq
      Equations
      def Cinque2005.instReprFreq.repr :
      FreqStd.Format
      Equations
      Instances For
        @[implicit_reducible]
        Equations

        One row of Cinque's typology table: an order of {Dem,Num,A,N}, whether it is attested, and its frequency bucket.

        Instances For
          def Cinque2005.instDecidableEqOrderRow.decEq (x✝ x✝¹ : OrderRow) :
          Decidable (x✝ = x✝¹)
          Equations
          • One or more equations did not get rendered due to their size.
          Instances For
            def Cinque2005.instReprOrderRow.repr :
            OrderRowStd.Format
            Equations
            • One or more equations did not get rendered due to their size.
            Instances For

              [Cin05] table (6), rows a–x.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For
                theorem Cinque2005.attested_count :
                (List.filter (fun (x : OrderRow) => x.attested) table).length = 14

                [Cin05]: exactly 14 of the 24 orders are attested.

                theorem Cinque2005.unattested_count :
                (List.filter (fun (x : OrderRow) => decide ¬x.attested = true) table).length = 10

                The 10 unattested orders.

                theorem Cinque2005.table_freq_consistent :
                (table.all fun (r : OrderRow) => decide ((r.attested = true) = (r.freq Freq.unattested))) = true

                Every attested order is .unattested-free; every unattested order has freq .unattested (table internal consistency).

                Derivations on the MCB substrate #

                Built bottom-up: initial := noun, then head-initial External Merge of A, Num, Dem (emL), interleaved with SO.Step.im leftward raising. surfaceCats reads off the order via the computable derivation-grounded externalization.

                Movers: a bare NP raise passes noun; a pied-piping raise passes the NP-containing constituent (which contains the earlier SO.traceLeaf), built with the planar DSL so the surface orders decide.

                The pied-piped [N [A t]] constituent raised again past Num, with the Num sub-trace: [N [Num [t [A t]]]] — the (t) roll-up's third mover.

                Equations
                • One or more equations did not get rendered due to their size.
                Instances For

                  The successive pied-pipe past Num for (x): [[N [A t]] [Num t]].

                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For

                    (b) Dem Num N A — NP raises around A (bare).

                    Equations
                    • One or more equations did not get rendered due to their size.
                    Instances For

                      (c) Dem N Num A — NP raises around A then Num (partial, bare).

                      Equations
                      • One or more equations did not get rendered due to their size.
                      Instances For

                        (d) N Dem Num A — NP raises around A, Num, Dem (total, bare).

                        Equations
                        • One or more equations did not get rendered due to their size.
                        Instances For

                          (n) Dem A N Num — pied-pipe [A N] around Num (no sub-raise).

                          Equations
                          • One or more equations did not get rendered due to their size.
                          Instances For

                            (o) Dem N A Num — raise N around A, then pied-pipe [N A] around Num.

                            Equations
                            • One or more equations did not get rendered due to their size.
                            Instances For

                              (t) N Num A Dem — bare raise around A and Num, then pied-pipe [N Num A] around Dem.

                              Equations
                              • One or more equations did not get rendered due to their size.
                              Instances For

                                (x) N A Num Dem — successive pied-piping all the way up (the roll-up).

                                Equations
                                • One or more equations did not get rendered due to their size.
                                Instances For

                                  The eight derivations produce their attested orders (kernel-checked) #

                                  theorem Cinque2005.derived_orders_attested :
                                  ([derA, derB, derC, derD, derN, derO, derT, derX].all fun (d : Minimalist.SO.Derivation) => (List.filter (fun (x : OrderRow) => x.attested) table).any fun (x : OrderRow) => decide (x.order = d.surfaceCats)) = true

                                  Each of the eight derived orders is an attested order in Cinque's table (the constructive half of Universal 20: these orders arise by NP-movement from the universal base).

                                  The restrictive result: reachable orders = exactly the 14 attested #

                                  The constructive theorems above derive specific attested orders. Here we enumerate the whole legal derivation space and show its surface orders are exactly the 14 attested ones — so the 10 unattested orders are underivable (each would require a "wrong Merge order"; [Cin05], cf. Abels & Neeleman 2012's leftward-movement characterization).

                                  The enumeration runs on the substrate's ordered planar externalization type RoseTree SOLabel (what the externalization replay produces) with planarYield (the substrate's planar yield, traces dropped). Building the base bottom-up, at each of the three specs (around A, Num, Dem) we optionally raise an N-containing proper subtree to the left edge — leftward movement of an N-containing constituent, no remnant movement. Fully computable; decide-checked.

                                  The distinct surface orders reachable by Cinque's movement.

                                  Equations
                                  • One or more equations did not get rendered due to their size.
                                  Instances For

                                    The 14 attested orders.

                                    Equations
                                    Instances For

                                      Exactly 14 orders are reachable.

                                      theorem Cinque2005.u20_reachable_iff_attested :
                                      (table.all fun (r : OrderRow) => decide (decide (r.order reachableOrders) = r.attested)) = true

                                      Universal 20 (restrictive core): an order is reachable by Cinque's constrained NP-movement iff it is attested — over all 24 orders. The 14 attested are derivable; the 10 unattested are underivable.

                                      theorem Cinque2005.reachable_eq_attested :
                                      (reachableOrders.all fun (x : List Minimalist.Cat) => decide (x attestedOrders)) = true (attestedOrders.all fun (x : List Minimalist.Cat) => decide (x reachableOrders)) = true

                                      The reachable set is exactly the attested set.

                                      Markedness tracks frequency ([Cin05] (7b), (6)) #

                                      Cinque's (7b) ranks the movement parameters: no movement, whose-picture pied-piping, and total movement are unmarked; partial movement, movement without pied-piping, and picture-of-who pied-piping are marked. Each attested order's cheapest derivation has a count of marked options, and the claim is that this count tracks cross-linguistic frequency.

                                      Why markedOptions is transcribed, not derived. Cinque's count is not a locally-compositional function of the derivation: whether a given step is marked depends on the whole derivation. The sharpest case is the first bare NP-raise around the lowest modifier A — unmarked when it begins the roll-up (6x) (vacuous whose-picture pied-piping, since the stranded modifiers are picked up later) but marked ("without pied-piping") in (6c), where they never are. The same step is marked or not depending on global derivation shape. So markedOptions transcribes the holistic per-order analysis (6a–6x) — paper data, like the attestation table — and markedMoves below is a genuinely derived local proxy that we prove is insufficient (it neither matches Cinque's count nor predicts frequency), justifying the transcription.

                                      Caveat — these counts are not mechanically verified against Cinque. They are read from his prose; the decide theorems check only internal consistency and the frequency correlation, not fidelity (re-reading is how the original (6b) error was caught, not a theorem). Reverse-engineering a rule — (# distinct marked pied-piping parameters) + (1 if partial) — reproduces 11–12 of the 14, but diverges at (6k) (rule 3, Cinque 2: he does not add the partial penalty) and (6w) (rule 2, Cinque 1; hedged in fn 28). Whether those two are the rule's limitation or a transcription artifact is open — a fully derived markedness (computing the count from the enumerated derivation histories, not just surface forms) would settle it by making any disagreement with the paper an explicit, checkable claim.

                                      def Cinque2005.markedOptions :
                                      List Minimalist.CatOption

                                      Cinque's holistic marked-option count for each attested order, transcribed from the per-order analysis (6a–6x) ([Cin05] (7b)); none for the 10 unattested (no derivation). (6p) is the "especially marked"/possibly-spurious case (fn 27). Not locally derivable — see markedMoves.

                                      Equations
                                      Instances For
                                        theorem Cinque2005.markedOptions_isSome_iff_attested :
                                        (table.all fun (r : OrderRow) => decide ((markedOptions r.order).isSome = r.attested)) = true

                                        Markedness is defined exactly on the derivable orders.

                                        theorem Cinque2005.markedness_extremes :
                                        ((List.filter (fun (x : OrderRow) => x.attested) table).all fun (r : OrderRow) => decide ((markedOptions r.order = some 0 r.freq = Freq.veryMany) (markedOptions r.order = some 2r.freq = Freq.few r.freq = Freq.veryFew))) = true

                                        Markedness predicts frequency at the extremes ([Cin05] (6)–(7)): an order is derivable with no marked option iff it is the most frequent (very many) — orders (6a), (6x) — and any order needing two marked options is rare (few / very few). The one-marked-option middle is mixed (Universal 20's residual exceptions).

                                        A derived local proxy — and why it is insufficient #

                                        markedMoves is computed from the substrate: the minimum, over the enumerated derivations producing an order, of the number of locally marked moves (a raise of a bare NP or of a picture-of-who constituent is marked; a whose-picture raise — NP leftmost in the moved constituent — is unmarked). This is what a naive compositional reading of (7b) gives. The two theorems below show it is not Cinque's count and does not track frequency.

                                        def Cinque2005.markedMoves (ord : List Minimalist.Cat) :
                                        Option

                                        Derived local proxy: the minimum number of locally-marked moves over the derivations producing ord (none if unreachable).

                                        Equations
                                        • One or more equations did not get rendered due to their size.
                                        Instances For

                                          The derived local proxy disagrees with Cinque's holistic count: the roll-up (6x) has one locally-marked move (its first bare N-raise), yet Cinque counts it unmarked (that raise is vacuous pied-piping, resolved by later raises).

                                          The local proxy does not predict frequency: (6x) and (6c) both have one locally-marked move, but (6x) is the most frequent (very many) and (6c) the rarest (very few). Cinque's frequency correlation needs the holistic markedness, not a local move-count — hence markedOptions is transcribed.

                                          TODO (follow-up) #

                                          A Cat.Dem constructor (replacing the .D stand-in) once its extended-projection F-level / ±V±N features are settled — a substrate refinement orthogonal to the combinatorial and markedness results above.