Japanese Prosody Fragment
@cite{beckman-pierrehumbert-1986} @cite{kawahara-2015}
Japanese prosodic entries following the autosegmental-metrical analysis of @cite{beckman-pierrehumbert-1986}, with accent assignment rules and affix typology from @cite{kawahara-2015}.
Key Properties
- Lexical accent: accent location specified in the lexicon as a linked H tone. The accent shape is fixed (H*+L); only the location varies.
- Accented vs unaccented: some words are lexically unaccented (e.g., compound nouns formed from unaccented words). Unaccented words can form well-formed utterances without any pitch accent.
- Accentual phrase: delimited by a phrasal H (on the second sonorant mora) and a boundary L. Contains at most one accent.
- Sparse tonal specification: only the accent H, the phrasal H, and the boundary L are specified; F0 between them is interpolated.
- Culminativity: at most one HL fall per prosodic word — Japanese is a pitch-accent language, not a tone language.
- Default accent: loanwords and nonce words receive a default accent, assigned either by the Antepenultimate Accent Rule (AAR) or by the Latin Stress Rule (LSR).
- Eight affix accent types: recessive, dominant, recessive pre-accenting, dominant pre-accenting, accent-shifting, post-accenting, deaccenting, initial-accenting.
A Japanese lexical entry with prosodic specification.
The accent is specified as the mora position of the linked H tone
(0-indexed from the beginning of the word). Unaccented words have
accentMora = none.
- form : String
Surface form (romanized)
- gloss : String
Gloss
- accentMora : Option ℕ
Mora position of the accent (none = unaccented)
- nMorae : ℕ
Number of morae in the word
Instances For
Convert to Bool accentedness for bridge to AccentualPhrase.
Equations
- e.accentedBool = e.isAccented
Instances For
kami 'god' — accented on first mora (initial accent). Contrasts with kami 'paper' (unaccented) and kamí 'hair' (accent on second mora).
Equations
- Fragments.Japanese.Prosody.kami_god = { form := "kami", gloss := "god", accentMora := some 0, nMorae := 2 }
Instances For
kami 'paper' — unaccented. No HL fall in the accentual phrase.
Equations
- Fragments.Japanese.Prosody.kami_paper = { form := "kami", gloss := "paper", accentMora := none, nMorae := 2 }
Instances For
uma'i — accented adjective (§2.2, Figs. 6, 8, 9).
Equations
- Fragments.Japanese.Prosody.umai = { form := "umai", gloss := "delicious", accentMora := some 1, nMorae := 3 }
Instances For
amai — accented adjective (§2.3, Fig. 8).
Equations
- Fragments.Japanese.Prosody.amai = { form := "amai", gloss := "sweet", accentMora := some 1, nMorae := 3 }
Instances For
mame — unaccented noun (§2.2, Fig. 6).
Equations
- Fragments.Japanese.Prosody.mame = { form := "mame", gloss := "beans", accentMora := none, nMorae := 2 }
Instances For
ame — unaccented noun (§2.2, Fig. 6).
Equations
- Fragments.Japanese.Prosody.ame = { form := "ame", gloss := "rain", accentMora := none, nMorae := 2 }
Instances For
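The entry structure and the kami minimal pair above can be sketched in plain Lean 4 as follows. This is a minimal reconstruction from the documented field list, assuming `Nat` in place of `ℕ` so that no Mathlib import is needed. The `EntrySketch` namespace, the `isAccented` definition via `Option.isSome`, and the `wellFormed` check are assumptions for illustration, not the fragment's actual source.

```lean
namespace EntrySketch

/-- Reconstructed from the documented fields; the real definition may differ. -/
structure JProsodicEntry where
  form : String
  gloss : String
  accentMora : Option Nat   -- 0-indexed mora of the linked H; `none` = unaccented
  nMorae : Nat
  deriving Repr

/-- Assumed definition: an entry is accented iff it has an accent location. -/
def JProsodicEntry.isAccented (e : JProsodicEntry) : Bool :=
  e.accentMora.isSome

/-- Hypothetical sanity check: the accent must fall inside the word. -/
def JProsodicEntry.wellFormed (e : JProsodicEntry) : Bool :=
  match e.accentMora with
  | none => true
  | some m => decide (m < e.nMorae)

def kamiGod : JProsodicEntry :=
  { form := "kami", gloss := "god", accentMora := some 0, nMorae := 2 }

def kamiPaper : JProsodicEntry :=
  { form := "kami", gloss := "paper", accentMora := none, nMorae := 2 }

-- Same segments, different accent specification.
example : kamiGod.form = kamiPaper.form := rfl
example : kamiGod.isAccented = true := rfl
example : kamiPaper.isAccented = false := rfl
example : kamiGod.wellFormed = true := rfl

end EntrySketch
```

The only difference between the two entries is the `accentMora` field, which is what the lexical-accent analysis predicts for a minimal pair.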
A Japanese lexical entry extending JProsodicEntry with the two
annotations needed for frequency-conditioned phonology (e.g., the
Breiss-Katsuda-Kawahara compounds in
Phenomena/Phonology/Studies/BreissKatsudaKawahara2026.lean):
a corpus token log-frequency and a free/bound flag.
Following CLAUDE.md's "infrastructure on demand", these annotations
are kept on a thin extension structure rather than added to
JProsodicEntry, so existing accent-only consumers are unaffected.
The HasTokenFreq typeclass instance below makes this entry
consumable by any module under Theories/Phonology/ItemSpecificity/.
- form : String
- gloss : String
- accentMora : Option ℕ
- nMorae : ℕ
- tokenLogFreq : ℚ
Token log-frequency in a reference corpus (e.g., BCCWJ).
0 conventionally means "log of 1 occurrence" and is used as the no-info default for unannotated items. Stored as ℚ so that the lexicon remains computable while the abstract Theories/Phonology/ interface coerces to ℝ.
- canStandAlone : Bool
Can this morpheme stand alone as a wordform?
false for bound stems that occur only in compounds (e.g., the bound N2s targeted in @cite{breiss-katsuda-kawahara-2026}).
Instances For
HasTokenFreq instance routing tokenLogFreq through the
fragment-level ℚ field into the abstract LogFreq := ℝ interface
used by Theories/Phonology/ItemSpecificity/. Rat.cast is the
standard mathlib coercion. The instance is noncomputable because
ℝ is noncomputable; the ℚ field itself remains computable for
decide-style proofs.
Equations
- Fragments.Japanese.Prosody.instHasTokenFreqJLexicalEntry = { tokenLogFreq := fun (e : Fragments.Japanese.Prosody.JLexicalEntry) => ↑e.tokenLogFreq }
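The ℚ-to-ℝ routing described above might look roughly like this. It is a sketch that assumes Mathlib is available. The class shape of `HasTokenFreq`, the names `Entry` and `LexEntry`, and the placeholder morpheme are all hypothetical stand-ins, since the real interface under Theories/Phonology/ItemSpecificity/ is not shown on this page.

```lean
import Mathlib.Data.Real.Basic

namespace FreqSketch

/-- Hypothetical stand-in for the abstract frequency interface. -/
class HasTokenFreq (α : Type) where
  tokenLogFreq : α → ℝ

/-- Cut-down base entry (only two of the documented fields). -/
structure Entry where
  form : String
  accentMora : Option ℕ

/-- Thin extension carrying the frequency and free/bound annotations. -/
structure LexEntry extends Entry where
  tokenLogFreq : ℚ
  canStandAlone : Bool

/-- The cast into ℝ is the usual coercion from ℚ. -/
noncomputable instance : HasTokenFreq LexEntry where
  tokenLogFreq e := (e.tokenLogFreq : ℝ)

-- Placeholder morpheme, not an attested lexical item.
def boundStem : LexEntry :=
  { form := "bound-stem", accentMora := none, tokenLogFreq := 0, canStandAlone := false }

-- The ℚ field itself stays on the computable side.
example : boundStem.tokenLogFreq = 0 := rfl
example : boundStem.canStandAlone = false := rfl

end FreqSketch
```

Keeping the annotations on an extension structure means that code written against the base entry never has to mention frequency at all.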
A Japanese N1 + N2 nominal compound. Compound-medial position is the locus of voiced velar nasalisation (/g/ → [ŋ]) studied in @cite{breiss-katsuda-kawahara-2026}: obligatory when N2 is bound, optional and frequency-conditioned when N2 is free.
The compound's own tokenLogFreq is independent of N1's and
N2's — high-frequency compounds with low-frequency components, and
vice versa, both occur. Frequency-conditioned theories that treat
the compound's frequency as inherited from constituents (e.g., some
RepresentationStrength variants with multiplicative inheritance)
must account for this observed independence.
- n1 : JLexicalEntry
- n2 : JLexicalEntry
- compoundLogFreq : ℚ
Token log-frequency of the compound as a unit — typically much lower than either constituent in isolation, but the principal conditioning variable on optional nasalisation.
Instances For
A compound's nasalisation is obligatory iff its N2 is bound. The free-N2 case is the gradient one tested in @cite{breiss-katsuda-kawahara-2026}.
Equations
Instances For
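The obligatory/optional split stated above can be sketched as a Boolean predicate on the compound. The definition name `nasalisationObligatory` and the cut-down `Morpheme` structure are assumptions, since the rendered equation is empty on this page, and the two stems are placeholders rather than attested items.

```lean
namespace CompoundNasalSketch

structure Morpheme where
  form : String
  canStandAlone : Bool   -- false for bound stems

structure Compound where
  n1 : Morpheme
  n2 : Morpheme

/-- Assumed definition: nasalisation is obligatory exactly when N2 is bound. -/
def Compound.nasalisationObligatory (c : Compound) : Bool :=
  !c.n2.canStandAlone

-- Placeholder morphemes, not attested lexical items.
def freeStem : Morpheme := { form := "free-stem", canStandAlone := true }
def boundStem : Morpheme := { form := "bound-stem", canStandAlone := false }

-- Bound N2: obligatory. Free N2: the gradient, frequency-conditioned case.
example : (Compound.mk freeStem boundStem).nasalisationObligatory = true := rfl
example : (Compound.mk freeStem freeStem).nasalisationObligatory = false := rfl

end CompoundNasalSketch
```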
Japanese accentual phrase tonal specification.
@cite{beckman-pierrehumbert-1986} §2.2: the AP is defined by:
- A boundary L at the beginning (or end of preceding AP)
- A phrasal H on the second sonorant mora
- An optional accent HL (if the word is accented)
- A boundary L at the end
The phrasal H is not H-tone spreading from the accent: it has its own local pitch range and is always present, even in unaccented phrases (Fig. 3, contra earlier accounts).
- words : List JProsodicEntry
Words grouped in this AP
- hasPhrasalH : Bool
Whether the phrasal H is present (always true in Japanese)
Instances For
An AP is accented if any word in it is accented.
Equations
- ap.isAccented = ap.words.any fun (x : Fragments.Japanese.Prosody.JProsodicEntry) => x.isAccented
Instances For
Convert to the generic AccentualPhrase type. Japanese accent shape is always H*+L; unaccented APs get null.
Equations
- ap.toGeneric = { accent := if ap.isAccented = true then Features.Prosody.PitchAccent.H_star_plus_L else Features.Prosody.PitchAccent.null, nWords := ap.words.length }
Instances For
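A self-contained sketch of the two definitions above, using the documented `List.any` formulation. The local `PitchAccent` type stands in for `Features.Prosody.PitchAccent`, the `Word` structure is a cut-down entry, and the default value on `hasPhrasalH` is an assumption suggested by the field's docstring.

```lean
namespace APSketch

/-- Local stand-in for the generic pitch accent type. -/
inductive PitchAccent where
  | H_star_plus_L
  | null
  deriving DecidableEq, Repr

structure Word where
  form : String
  accentMora : Option Nat

def Word.isAccented (w : Word) : Bool := w.accentMora.isSome

structure AP where
  words : List Word
  hasPhrasalH : Bool := true   -- assumed default; "always true in Japanese"

/-- An AP is accented if any of its words is accented. -/
def AP.isAccented (ap : AP) : Bool :=
  ap.words.any fun w => w.isAccented

/-- Accented APs carry H*+L; unaccented APs carry no pitch accent. -/
def AP.accent (ap : AP) : PitchAccent :=
  if ap.isAccented then PitchAccent.H_star_plus_L else PitchAccent.null

def umai : Word := { form := "umai", accentMora := some 1 }
def mame : Word := { form := "mame", accentMora := none }
def ame  : Word := { form := "ame",  accentMora := none }

example : AP.accent { words := [umai, mame] } = PitchAccent.H_star_plus_L := rfl
example : AP.accent { words := [ame, mame] } = PitchAccent.null := rfl

end APSketch
```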
Accented words have accent location.
Unaccented words lack accent location.
Japanese accent is lexical.
The Japanese pitch accent shape is H*+L (a single bitonal accent).
A Japanese accented AP always triggers catathesis (because H*+L is bitonal).
A Japanese unaccented AP never triggers catathesis.
An AP containing only unaccented words is unaccented.
An AP containing an accented word is accented.
Japanese suffix accent specification.
Japanese suffixes exhibit the same dominant/recessive distinction as Indo-European (IE) accent systems (@cite{kiparsky-halle-1977}) and grammatical tone (GT) systems (@cite{rolle-2018}). Dominant suffixes remove stem accent; recessive suffixes preserve it when present.
- form : String
- gloss : String
- dominance : Features.Prosody.ProsodicDominance
Instances For
-teki (的): deaccenting suffix. The derived word is unaccented whether or not the stem carries an accent; classified as subtractive-dominant in GT terms (@cite{kawahara-2015}).
Equations
- Fragments.Japanese.Prosody.teki_suffix = { form := "-teki", gloss := "的 ADJ", dominance := Features.Prosody.ProsodicDominance.dominant }
Instances For
-si (氏): non-deaccenting suffix. Preserves stem accent when present — classified as recessive (@cite{kawahara-2015}).
Equations
- Fragments.Japanese.Prosody.si_suffix = { form := "-si", gloss := "氏 Mr.", dominance := Features.Prosody.ProsodicDominance.recessive }
Instances For
Deaccenting suffixes are dominant.
Non-deaccenting suffixes are not dominant.
Derive the accent of a suffixed word from stem accent + suffix dominance.
Equations
- Fragments.Japanese.Prosody.suffixedAccent stem suffix = Features.Prosody.ProsodicDominance.combineAccent stem.accentMora suffix.dominance
Instances For
-teki deaccents kami 'god' (accented).
-teki leaves kami 'paper' (unaccented) unchanged.
-teki neutralizes the kami 'god' / kami 'paper' contrast.
-si preserves the accent on kami 'god'.
-si preserves the unaccentedness of kami 'paper'.
-si maintains the kami 'god' / kami 'paper' contrast.
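The six statements above follow from a two-case definition. The sketch below assumes that `combineAccent` erases the stem accent under a dominant suffix and passes it through under a recessive one. That behaviour is inferred from the docstrings, and the local `Dominance` type is a stand-in for `Features.Prosody.ProsodicDominance`.

```lean
namespace SuffixSketch

inductive Dominance where
  | dominant
  | recessive

/-- Assumed behaviour: dominant erases the stem accent, recessive keeps it. -/
def combineAccent (stem : Option Nat) : Dominance → Option Nat
  | .dominant => none
  | .recessive => stem

def kamiGod : Option Nat := some 0   -- accentMora of kami 'god'
def kamiPaper : Option Nat := none   -- accentMora of kami 'paper'

-- -teki (dominant): both stems come out unaccented, so the contrast is lost.
example : combineAccent kamiGod .dominant = none := rfl
example : combineAccent kamiGod .dominant = combineAccent kamiPaper .dominant := rfl

-- -si (recessive): each stem keeps its specification, so the contrast survives.
example : combineAccent kamiGod .recessive = some 0 := rfl
example : combineAccent kamiPaper .recessive = none := rfl
example : combineAccent kamiGod .recessive ≠ combineAccent kamiPaper .recessive := by
  decide

end SuffixSketch
```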
Loanword prosodic entry. Extends JProsodicEntry with syllable weight
profile for testing AAR vs LSR predictions.
- entry : JProsodicEntry
- weights : List Phonology.Syllable.SyllWeight
Syllable weight profile (left to right)
Instances For
kurisumasu 'Christmas' — accent on antepenultimate mora (su). @cite{kawahara-2015} (10a).
Equations
- One or more equations did not get rendered due to their size.
Instances For
asufaruto 'asphalt' — accent on antepenultimate mora (fa). @cite{kawahara-2015} (10g).
Equations
- One or more equations did not get rendered due to their size.
Instances For
makudonarudo 'McDonald' — accent on antepenultimate mora (na). @cite{kawahara-2015} (10h).
Equations
- One or more equations did not get rendered due to their size.
Instances For
amerika 'America' — unaccented (4-mora with two final light σ). @cite{kawahara-2015} (16a).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Loanword accent matches AAR prediction for all-light syllables.
Loanword accent matches LSR prediction for all-light syllables.
HLH: AAR predicts penultimate (σ₂), LSR predicts antepenultimate (σ₁). @cite{kawahara-2015} Table 1, row (c).
LLH: AAR predicts penultimate (σ₂), LSR predicts antepenultimate (σ₁). @cite{kawahara-2015} Table 1, row (g).
The remaining 6 conditions in Table 1 produce matching predictions.
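The AAR/LSR comparison can be reproduced with a small computation over syllable weight profiles. This is a sketch under stated assumptions: light syllables count one mora and heavy syllables two, AAR targets the syllable containing the third mora from the right edge, and LSR targets the penult if heavy and the antepenult otherwise. The names `Weight`, `aar`, `lsr`, and `aarGo` are hypothetical, and positions are counted from the right (0 = final, 1 = penultimate, 2 = antepenultimate).

```lean
namespace LoanSketch

inductive Weight where
  | L   -- light syllable, one mora
  | H   -- heavy syllable, two morae
  deriving DecidableEq, Repr

def Weight.morae : Weight → Nat
  | .L => 1
  | .H => 2

/-- Walk leftward from the right edge until three morae have been covered. -/
def aarGo : List Weight → Nat → Nat → Option Nat
  | [], _, _ => none
  | w :: rest, idx, seen =>
    if seen + w.morae ≥ 3 then some idx else aarGo rest (idx + 1) (seen + w.morae)

/-- AAR: the syllable holding the antepenultimate mora. -/
def aar (ws : List Weight) : Option Nat := aarGo ws.reverse 0 0

/-- LSR: the penultimate syllable if heavy, otherwise the antepenultimate. -/
def lsr (ws : List Weight) : Option Nat :=
  match ws.reverse with
  | _ :: Weight.H :: _ => some 1
  | _ :: Weight.L :: _ :: _ => some 2
  | _ => none

-- The two profiles on which the rules diverge.
example : aar [.H, .L, .H] = some 1 := rfl
example : lsr [.H, .L, .H] = some 2 := rfl
example : aar [.L, .L, .H] = some 1 := rfl
example : lsr [.L, .L, .H] = some 2 := rfl

-- Two of the six profiles on which they agree.
example : aar [.L, .L, .L] = lsr [.L, .L, .L] := rfl
example : aar [.L, .H, .L] = lsr [.L, .H, .L] := rfl

end LoanSketch
```

The divergence arises only when the final syllable is heavy and the penult is light: the heavy final contributes two morae, so the third mora from the edge lands in the penult, while LSR skips a light penult.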
Japanese suffix with fine-grained accent classification.
- form : String
- gloss : String
- accentType : Features.Prosody.AffixAccentType
Instances For
-tara (conditional): recessive suffix — bears accent, loses to root. @cite{kawahara-2015} (29).
Equations
- Fragments.Japanese.Prosody.tara_suffix = { form := "-tara", gloss := "conditional", accentType := Features.Prosody.AffixAccentType.recessive }
Instances For
-ppoi (-ish): dominant suffix — bears accent, overrides root. @cite{kawahara-2015} (30).
Equations
- Fragments.Japanese.Prosody.ppoi_suffix = { form := "-ppoi", gloss := "-ish", accentType := Features.Prosody.AffixAccentType.dominant }
Instances For
-si (Mr.): recessive pre-accenting — inserts accent on root-final σ when root is unaccented, preserves root accent when present. @cite{kawahara-2015} (31).
Equations
- Fragments.Japanese.Prosody.si_affix = { form := "-si", gloss := "Mr.", accentType := Features.Prosody.AffixAccentType.recessivePreAccent }
Instances For
-ke (family of): dominant pre-accenting — always inserts accent on root-final σ, deleting any root accent. @cite{kawahara-2015} (32).
Equations
- Fragments.Japanese.Prosody.ke_suffix = { form := "-ke", gloss := "family of", accentType := Features.Prosody.AffixAccentType.dominantPreAccent }
Instances For
-mono (thing): accent-shifting — shifts existing root accent to pre-suffix position. Unaccented roots stay unaccented. @cite{kawahara-2015} (33).
Equations
- Fragments.Japanese.Prosody.mono_suffix = { form := "-mono", gloss := "thing", accentType := Features.Prosody.AffixAccentType.accentShifting }
Instances For
o- (honorific prefix): post-accenting — inserts accent after prefix. @cite{kawahara-2015} (34).
Equations
- Fragments.Japanese.Prosody.o_prefix = { form := "o-", gloss := "honorific", accentType := Features.Prosody.AffixAccentType.postAccenting }
Instances For
-teki (的 -like): deaccenting — deletes root accent, no new accent. @cite{kawahara-2015} (36).
Equations
- Fragments.Japanese.Prosody.teki_affix = { form := "-teki", gloss := "的 -like", accentType := Features.Prosody.AffixAccentType.deaccenting }
Instances For
-zu (group/plural): initial-accenting — inserts accent on root-initial σ. @cite{kawahara-2015} (39).
Equations
- Fragments.Japanese.Prosody.zu_suffix = { form := "-zu", gloss := "group", accentType := Features.Prosody.AffixAccentType.initialAccenting }
Instances For
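For the five affix types whose outcome is stated purely in terms of the root, the descriptions above can be turned into a small function. This is a sketch, not the fragment's own definition. It assumes syllables are 0-indexed, that the root has at least one syllable, and that "pre-suffix position" means the root-final syllable. The recessive, dominant, and post-accenting types are left out because their outcome depends on where the affix's own accent lands, which is not specified here.

```lean
namespace AffixSketch

/-- The five documented types whose outcome depends only on the root. -/
inductive RootAffixType where
  | recessivePreAccent | dominantPreAccent | accentShifting
  | deaccenting | initialAccenting

/-- Accented syllable of the derived word, given the root's accent and length.
    Assumes `nSyll ≥ 1`. -/
def derivedAccent (root : Option Nat) (nSyll : Nat) : RootAffixType → Option Nat
  | .recessivePreAccent => some (root.getD (nSyll - 1))  -- keep root accent, else root-final
  | .dominantPreAccent => some (nSyll - 1)               -- always root-final
  | .accentShifting => root.map fun _ => nSyll - 1       -- shift only if accent exists
  | .deaccenting => none                                 -- always unaccented
  | .initialAccenting => some 0                          -- always root-initial

-- A two-syllable root, accented on the first syllable vs unaccented.
example : derivedAccent (some 0) 2 .recessivePreAccent = some 0 := rfl
example : derivedAccent none 2 .recessivePreAccent = some 1 := rfl
example : derivedAccent (some 0) 2 .dominantPreAccent = some 1 := rfl
example : derivedAccent (some 0) 2 .accentShifting = some 1 := rfl
example : derivedAccent none 2 .accentShifting = none := rfl
example : derivedAccent (some 0) 2 .deaccenting = none := rfl

end AffixSketch
```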
Recessive pre-accenting is recessive at the coarse level (preserves root accent when present).
Deaccenting is dominant at the coarse level (overrides root accent).
This corrects the earlier classification of -teki in this fragment,
which used ProsodicDominance.dominant — functionally the same
projection, but the fine-grained type makes the behavior explicit.
Accent-shifting is recessive at the coarse level: it only operates on accent that is already present, never creating new accent.
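The coarse projection described by these three statements can be sketched as a partial map from the fine-grained type to the two-way dominance distinction. Only the recessive pre-accenting, deaccenting, and accent-shifting cases are stated in this section. The mapping of the plain and dominant pre-accenting types is an assumption based on their names and docstrings, and the two remaining types are deliberately left as `none`. The type and function names are local stand-ins.

```lean
namespace CoarseSketch

inductive Dominance where
  | dominant
  | recessive
  deriving DecidableEq, Repr

inductive AffixType where
  | recessive | dominant | recessivePreAccent | dominantPreAccent
  | accentShifting | postAccenting | deaccenting | initialAccenting

/-- Hypothetical coarse projection; `none` marks cases not stated here. -/
def AffixType.coarse? : AffixType → Option Dominance
  | .recessive | .recessivePreAccent | .accentShifting => some Dominance.recessive
  | .dominant | .dominantPreAccent | .deaccenting => some Dominance.dominant
  | .postAccenting | .initialAccenting => none

example : AffixType.coarse? .recessivePreAccent = some Dominance.recessive := rfl
example : AffixType.coarse? .deaccenting = some Dominance.dominant := rfl
example : AffixType.coarse? .accentShifting = some Dominance.recessive := rfl

end CoarseSketch
```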
kabuto+musi 'beetle': short N2 (musi, 2μ) pre-accents on N1-final syllable. @cite{kawahara-2015} (22a).
sin+yokohama 'Shin-Yokohama': long N2 (yokohama, 4μ, unaccented) → accent on N2-initial syllable. @cite{kawahara-2015} (23a).
sin+tamane'gi 'new onion': long N2 retains accent. @cite{kawahara-2015} (24a).
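The three compound patterns can be summarised as a decision on N2's length and accentedness. The sketch below is illustrative only: the cut-off between short and long N2 is exposed as a parameter `shortMax` with an assumed default of 2 morae, since the examples here fix only 2μ as short and 4μ as long, and the result type `CompoundAccent` is a hypothetical encoding.

```lean
namespace CompoundAccentSketch

inductive CompoundAccent where
  | n1Final                  -- accent on the final syllable of N1
  | n2Initial                -- accent on the initial syllable of N2
  | n2Retained (mora : Nat)  -- N2 keeps its own lexical accent
  deriving DecidableEq, Repr

/-- `shortMax` is an assumed threshold, not taken from the source. -/
def compoundAccent (n2Morae : Nat) (n2Accent : Option Nat) (shortMax : Nat := 2) :
    CompoundAccent :=
  if n2Morae ≤ shortMax then CompoundAccent.n1Final
  else
    match n2Accent with
    | none => CompoundAccent.n2Initial
    | some m => CompoundAccent.n2Retained m

-- kabuto+musi: short N2 (2μ); the accent argument is ignored in this branch.
example : compoundAccent 2 none = CompoundAccent.n1Final := rfl
-- sin+yokohama: long unaccented N2 (4μ).
example : compoundAccent 4 none = CompoundAccent.n2Initial := rfl
-- sin+tamane'gi: long accented N2 (4μ, accent on mora index 2, "ne").
example : compoundAccent 4 (some 2) = CompoundAccent.n2Retained 2 := rfl

end CompoundAccentSketch
```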
Unaccented trisyllable ame+ga → LHH (initial rise + H spread).
Accented ka'mi+ga → HLL (accent HL + L spread).
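The two surface patterns can be derived mora by mora from the accent specification. This sketch is a mora-level approximation that fills in every mora, which is a simplification of the sparse tonal specification described earlier. The rule used for accents on non-initial morae (initial L, H up to the accented mora, L afterwards) is an assumption extrapolated from the two examples, and `surfaceTones` is a hypothetical name.

```lean
namespace ToneSketch

inductive Tone where
  | L
  | H
  deriving DecidableEq, Repr

/-- Unaccented: L on the first mora, H elsewhere.
    Accented on mora `k`: H up to and including `k` (with an initial L
    unless `k = 0`), L on every mora after `k`. -/
def surfaceTones (nMorae : Nat) (accent : Option Nat) : List Tone :=
  (List.range nMorae).map fun i =>
    match accent with
    | none => if i = 0 then Tone.L else Tone.H
    | some k => if i ≤ k ∧ (i ≠ 0 ∨ k = 0) then Tone.H else Tone.L

-- ame+ga: three morae, unaccented.
example : surfaceTones 3 none = [Tone.L, Tone.H, Tone.H] := rfl
-- ka'mi+ga: three morae, accent on mora 0.
example : surfaceTones 3 (some 0) = [Tone.H, Tone.L, Tone.L] := rfl

end ToneSketch
```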