Corbett (1991): Gender — typology of noun-class systems #
@cite{corbett-1991} @cite{corbett-2013} @cite{dryer-haspelmath-2013} @cite{dixon-1972}
Greville Corbett. Gender. Cambridge University Press, 1991. Plus @cite{corbett-2013}'s WALS Chs 30, 31, 32.
This study file holds Corbett's cross-linguistic generalisations on the
22-language sample. Per-language profiles live in
Fragments/{Lang}/Gender.lean as genderTypology : GenderProfile —
constructed via GenderProfile.fromWALS so WALS Chs 30/31/32 are
auto-pulled and the editorial fields (rawGenderCount, agreementTargets,
semanticBases, attestedSurfaceGenders) are local per-language commitments.
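The split between WALS-derived fields and editorial fields can be pictured with a toy model. The sketch below is not Linglib's actual definition: the constructor names, the reduced field set, and a `fromWALS` that takes the WALS values as explicit arguments (rather than pulling them automatically) are all assumptions made for illustration, and everything lives in a hypothetical `ProfileSketch` namespace.

```lean
namespace ProfileSketch

/-- WALS Ch 30 bins (constructor names are hypothetical). -/
inductive GenderCount where
  | zero | two | three | four | fiveOrMore
  deriving DecidableEq, Repr

/-- WALS Ch 31 values (constructor names are hypothetical). -/
inductive GenderBasis where
  | noGender | sexBased | nonSexBased
  deriving DecidableEq, Repr

/-- WALS Ch 32 values (constructor names are hypothetical). -/
inductive AssignmentSystem where
  | noGender | semantic | semanticFormal
  deriving DecidableEq, Repr

/-- Reduced profile: three WALS-derived fields plus one editorial field. -/
structure GenderProfile where
  language : String
  count : GenderCount
  basis : GenderBasis
  assignment : AssignmentSystem
  rawGenderCount : Nat
  deriving Repr

/-- Stand-in constructor: here the WALS values are passed in by hand. -/
def GenderProfile.fromWALS (language : String) (count : GenderCount)
    (basis : GenderBasis) (assignment : AssignmentSystem)
    (rawGenderCount : Nat) : GenderProfile :=
  { language := language, count := count, basis := basis,
    assignment := assignment, rawGenderCount := rawGenderCount }

/-- Does a raw gender count fall inside a WALS Ch 30 bin? -/
def GenderCount.admits : GenderCount → Nat → Bool
  | .zero, n => n == 0
  | .two, n => n == 2
  | .three, n => n == 3
  | .four, n => n == 4
  | .fiveOrMore, n => decide (n ≥ 5)

def french : GenderProfile :=
  GenderProfile.fromWALS "French" .two .sexBased .semanticFormal 2

-- The editorial raw count agrees with the WALS bin.
example : french.count.admits french.rawGenderCount = true := by decide

end ProfileSketch
```

Keeping the raw count as a separate editorial field is what makes a consistency check against the WALS bin possible at all.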
Sample composition #
22 languages chosen to span all five GenderCount values:
- No gender (English, Mandarin, Japanese, Turkish, Finnish, Korean, Quechua — 7 languages).
- 2 genders (French, Spanish, Hindi-Urdu, Irish, Hebrew, Hausa — 6 languages).
- 3 genders (German, Russian, Latin, Romanian — 4 languages).
- 4 genders (Dyirbal, Georgian — 2 languages).
- 5+ noun classes (Swahili, Zulu, Fula — 3 languages).
The sample includes Bantu noun-class systems (Swahili, Zulu) and Fula (20+ classes), the Australian 4-class system Dyirbal, and the Caucasian rationality-based Georgian system, alongside the canonical European sex-based 2/3 systems.
What this file deliberately omits #
Aggregate-count theorems (sample_X_count = N, gender_scale_range,
sample_diversity) are omitted: they go stale every time a Fragment is
added to the sample, which is exactly the "aggregate-count theorems"
anti-pattern. The substantive Corbett 1991 generalisations (Agreement
Hierarchy properties, the no-purely-formal claim, basis × count
interactions) are kept, and the ISO sanity check remains as a drift sentry.
Per-language profiles are drawn from Fragments/{Lang}/Gender.lean via
GenderProfile.fromWALS. The aliases here allow concise reference below.
Three languages are record-updated to match Corbett's 1991 monograph
where it diverges from his later WALS chapters @cite{corbett-2013}.
The divergences are first-class theorems in §9 below — exposing them
rather than hiding them inside the Fragment.
Corbett-1991 record-overrides for the 3 languages where Corbett's 1991 book disagrees with his 2013 WALS chapters.
All 22 language profiles in the Corbett 1991 sample. English, Dyirbal,
and Georgian are the record-overridden versions; the WALS originals
are at Fragments.{English,Dyirbal,Georgian}.Gender.genderTypology.
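A record override of this kind can be sketched with Lean's structure-update syntax. The types and values below are a hypothetical, cut-down stand-in (the real GenderProfile has more fields); the English values follow the 1991 vs 2013 contrast stated in this file.

```lean
namespace OverrideSketch

/-- WALS Ch 30 bins (constructor names are hypothetical). -/
inductive GenderCount where
  | zero | two | three | four | fiveOrMore
  deriving DecidableEq, Repr

/-- Cut-down profile with only the fields needed for the illustration. -/
structure GenderProfile where
  language : String
  count : GenderCount
  rawGenderCount : Nat
  deriving Repr

/-- Stand-in for the WALS-derived Fragment profile (3-gender coding). -/
def walsEnglish : GenderProfile :=
  { language := "English", count := .three, rawGenderCount := 3 }

/-- Study-local override: only the listed fields change. -/
def english : GenderProfile :=
  { walsEnglish with count := .zero, rawGenderCount := 0 }

-- Untouched fields are inherited from the original record.
example : english.language = walsEnglish.language := rfl

-- Overridden fields carry the new value.
example : english.count = GenderCount.zero := rfl

end OverrideSketch
```

Because the override is built from the original with `with`, the Fragment-side record stays untouched and both versions remain available for comparison.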
All raw gender counts are consistent with their WALS bins.
All profiles are cross-chapter consistent.
All five GenderCount values are attested.
All three GenderBasis values are attested.
All three AssignmentSystem values are attested.
All 2- and 3-gender systems in the sample are sex-based. This reflects the cross-linguistic tendency for small gender systems to organise around sex.
@cite{corbett-1991}'s key finding: no language assigns gender on a purely formal basis without any semantic core. In the sample, every gendered language has semantic assignment (alone or combined with formal).
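A sample-level check of this generalisation could look roughly as follows. This is a minimal sketch under stated assumptions: `Entry`, `hasSemanticCore`, and `miniSample` are hypothetical names, and the three entries are a hand-written excerpt rather than the actual 22-language list.

```lean
namespace AssignmentSketch

/-- WALS Ch 32 values (constructor names are hypothetical). -/
inductive AssignmentSystem where
  | noGender | semantic | semanticFormal
  deriving DecidableEq, Repr

/-- Hypothetical reduced entry: just what the check needs. -/
structure Entry where
  language : String
  gendered : Bool
  assignment : AssignmentSystem
  deriving Repr

/-- An assignment system has a semantic core if semantics plays any role. -/
def hasSemanticCore : AssignmentSystem → Bool
  | .semantic | .semanticFormal => true
  | .noGender => false

/-- Hand-written excerpt, not the full sample. -/
def miniSample : List Entry :=
  [ { language := "French",  gendered := true,  assignment := .semanticFormal },
    { language := "German",  gendered := true,  assignment := .semanticFormal },
    { language := "Turkish", gendered := false, assignment := .noGender } ]

/-- Every gendered entry has a semantic core (implication encoded as `!a || b`). -/
theorem gendered_has_semantic_core :
    miniSample.all (fun e => !e.gendered || hasSemanticCore e.assignment) = true := by
  decide

end AssignmentSketch
```

Stating the property with `List.all` keeps it decidable, so adding a language to the list re-checks the claim without any manual proof maintenance.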
All 3-gender systems in the sample use semantic + formal assignment. In these sex-based systems, semantics alone does not settle whether most inanimate nouns are masculine, feminine, or neuter, so formal correlates are needed.
A gender system without any agreement is not a gender system — genders are precisely the categories that trigger agreement.
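Agreement-target ordering used here: attributive > predicate > relative pronoun > personal pronoun > verb. The first four positions are @cite{corbett-1991}'s Agreement Hierarchy; verb is the fifth target type tracked in this file.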
Verb agreement implies higher-target agreement in the sample (none of the languages agree only on verbs).
No language in the sample agrees only on verbs.
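The verb-agreement property can be sketched as a Boolean check over lists of agreement targets. The type, function, and sample names below are hypothetical, and the per-language target lists are illustrative assumptions rather than copies of the Fragment data.

```lean
namespace AgreementSketch

/-- The five target types tracked in this file (constructor names are hypothetical). -/
inductive AgreementTarget where
  | attributive | predicate | relativePronoun | personalPronoun | verb
  deriving DecidableEq, Repr

/-- Hypothetical reduced entry: a language with its agreement targets. -/
structure Entry where
  language : String
  targets : List AgreementTarget
  deriving Repr

/-- Verb agreement must co-occur with at least one non-verb target. -/
def verbImpliesOther (ts : List AgreementTarget) : Bool :=
  !ts.contains AgreementTarget.verb ||
    ts.any (fun t => t != AgreementTarget.verb)

/-- Illustrative target lists (assumed, not taken from the Fragments). -/
def miniSample : List Entry :=
  [ { language := "Russian",
      targets := [.attributive, .predicate, .relativePronoun, .personalPronoun, .verb] },
    { language := "German",
      targets := [.attributive, .relativePronoun, .personalPronoun] },
    { language := "Swahili",
      targets := [.attributive, .predicate, .relativePronoun, .personalPronoun, .verb] } ]

theorem verb_implies_other :
    miniSample.all (fun e => verbImpliesOther e.targets) = true := by
  decide

-- A hypothetical verb-only language would fail the check.
example : verbImpliesOther [AgreementTarget.verb] = false := by decide

end AgreementSketch
```

The negative example shows that the check is not vacuous: a verb-only profile is representable and is rejected.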
No gendered language in the sample has agreement only on personal pronouns: pronoun agreement always co-occurs with at least one other target.
Noun-class systems (5+) show agreement on more targets than smaller systems. In the sample, all noun-class systems agree on ≥4 of the 5 target types.
Non-sex-based systems in the sample have ≥4 genders. When gender is not organised around sex, the system tends to proliferate.
Semantic-only assignment in the sample is restricted to non-sex-based systems. Sex-based systems typically have formal correlates.
All sex-based systems in the sample use semantic + formal assignment.
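The basis × count and basis × assignment interactions can be phrased as Boolean implications over a list of entries. The following is a sketch only: `Entry`, `implies`, and both predicates are hypothetical names, and the four entries carry illustrative values chosen to be consistent with the statements in this file, not values read from the Fragments.

```lean
namespace BasisCountSketch

/-- WALS Ch 31 values (constructor names are hypothetical). -/
inductive GenderBasis where
  | noGender | sexBased | nonSexBased
  deriving DecidableEq, Repr

/-- WALS Ch 32 values (constructor names are hypothetical). -/
inductive AssignmentSystem where
  | noGender | semantic | semanticFormal
  deriving DecidableEq, Repr

/-- Hypothetical reduced entry. -/
structure Entry where
  language : String
  basis : GenderBasis
  assignment : AssignmentSystem
  rawGenderCount : Nat
  deriving Repr

/-- Boolean material implication. -/
def implies (a b : Bool) : Bool := !a || b

/-- Non-sex-based systems have at least four genders. -/
def nonSexBasedIsLarge (e : Entry) : Bool :=
  implies (e.basis == GenderBasis.nonSexBased) (decide (e.rawGenderCount ≥ 4))

/-- Semantic-only assignment occurs only in non-sex-based systems. -/
def semanticOnlyIsNonSexBased (e : Entry) : Bool :=
  implies (e.assignment == AssignmentSystem.semantic)
    (e.basis == GenderBasis.nonSexBased)

/-- Illustrative values (assumed for the sketch). -/
def miniSample : List Entry :=
  [ { language := "Spanish", basis := .sexBased,
      assignment := .semanticFormal, rawGenderCount := 2 },
    { language := "Russian", basis := .sexBased,
      assignment := .semanticFormal, rawGenderCount := 3 },
    { language := "Dyirbal", basis := .nonSexBased,
      assignment := .semantic, rawGenderCount := 4 },
    { language := "Fula", basis := .nonSexBased,
      assignment := .semanticFormal, rawGenderCount := 20 } ]

theorem non_sex_based_is_large :
    miniSample.all nonSexBasedIsLarge = true := by decide

theorem semantic_only_is_non_sex_based :
    miniSample.all semanticOnlyIsNonSexBased = true := by decide

end BasisCountSketch
```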
The gendered European languages in the sample (and Hindi-Urdu) all have canonical gender systems (sex-based, 2 or 3 genders, semantic + formal).
Every 2- or 3-gender sex-based language in the sample exposes the
appropriate Features.SurfaceGender values via the
attestedSurfaceGenders bridge field. Connects the typology layer
to the per-noun lexical layer.
Three first-class disagreements between Corbett's 1991 monograph and
his 2013 WALS chapters. The 1991 values are this Studies file's
english/dyirbal/georgian overrides; the 2013 values are the
Fragment-side Fragments.{Lang}.Gender.genderTypology (which goes
through GenderProfile.fromWALS). This follows Linglib's interconnection-density
thesis: incompatibilities are made visible rather than hidden.
Corbett 1991 vs 2013: English. The 1991 monograph applies a strict controller-marking criterion (English has none); the 2013 WALS chapter treats the he/she/it pronoun distinction as a 3-gender system.
Corbett 1991 vs 2013: Dyirbal. The 1991 monograph treats Dyirbal as non-sex-based (organising principles cut across sex); the 2013 WALS chapter codes it as sex-based on the "Class I includes males" criterion.
Corbett 1991 vs 2013: Georgian. The 1991 monograph treats Georgian's rationality/animacy split as a 4-class gender system (agreement on pronouns + verbs); the 2013 WALS chapter codes it as no-gender on the noun-side-marking criterion.
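Divergences of this kind can be stated as small theorems that pin down exactly which field differs. The sketch below uses a hypothetical cut-down profile and hypothetical definition names; the Dyirbal and Georgian values follow the contrasts described above.

```lean
namespace DivergenceSketch

/-- WALS Ch 30 bins (constructor names are hypothetical). -/
inductive GenderCount where
  | zero | two | three | four | fiveOrMore
  deriving DecidableEq, Repr

/-- WALS Ch 31 values (constructor names are hypothetical). -/
inductive GenderBasis where
  | noGender | sexBased | nonSexBased
  deriving DecidableEq, Repr

/-- Cut-down profile with only the two fields under comparison. -/
structure GenderProfile where
  language : String
  count : GenderCount
  basis : GenderBasis
  deriving Repr

/-- 2013 WALS coding: four genders, sex-based. -/
def dyirbalWALS : GenderProfile :=
  { language := "Dyirbal", count := .four, basis := .sexBased }

/-- 1991 override: same count, non-sex-based. -/
def dyirbal1991 : GenderProfile :=
  { dyirbalWALS with basis := .nonSexBased }

/-- 2013 WALS coding: no gender. -/
def georgianWALS : GenderProfile :=
  { language := "Georgian", count := .zero, basis := .noGender }

/-- 1991 override: a 4-class, non-sex-based system. -/
def georgian1991 : GenderProfile :=
  { georgianWALS with count := .four, basis := .nonSexBased }

theorem dyirbal_basis_diverges :
    dyirbal1991.basis ≠ dyirbalWALS.basis := by decide

-- The two sources agree on everything the override does not touch.
theorem dyirbal_count_agrees :
    dyirbal1991.count = dyirbalWALS.count := rfl

theorem georgian_count_diverges :
    georgian1991.count ≠ georgianWALS.count := by decide

end DivergenceSketch
```

If a later WALS update removed one of the disagreements, the corresponding `≠` theorem would stop compiling, which is the intended signal.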