From f1fcbe3d5bf5158f39b89f0eb3677db50fb2f8fd Mon Sep 17 00:00:00 2001
From: Melody Horn
Date: Sun, 25 Oct 2020 11:19:42 -0600
Subject: tidy up Unicode categories

---
 syntax.md | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/syntax.md b/syntax.md
index c312dfc..89c1c49 100644
--- a/syntax.md
+++ b/syntax.md
@@ -57,20 +57,19 @@ A *keyword* is one of the following literal words:
 
 An *identifier* is a sequence of one or more characters having Unicode categories within a legal set. The first character in an identifier must have one of the following Unicode categories:
 
-- Connector Punctuation (e.g. `_`)
-- Format Other (e.g. Zero-Width Joiner)
-- Lowercase Letter (e.g. `h`)
-- Modifier Letter (e.g. `ʹ`, U+02B9 Modifier Letter Prime)
-- Modifier Symbol (e.g. `^`, U+005E Circumflex Accent)
-- Nonspacing Mark (e.g. ` ̂`, U+0302 Combining Circumflex Accent)
-- Other Letter (e.g. `א`, U+05D0 Hebrew Letter Alef)
-- Titlecase Letter (e.g. `Dž`, U+01C5 Latin Capital Letter D With Small Letter Z With Caron)
-- Uppercase Letter (e.g. `B`)
+- `Pc` Connector Punctuation (e.g. `_`)
+- `Ll` Lowercase Letter (e.g. `h`)
+- `Lm` Modifier Letter (e.g. `ʹ`, U+02B9 Modifier Letter Prime)
+- `Lo` Other Letter (e.g. `א`, U+05D0 Hebrew Letter Alef)
+- `Lt` Titlecase Letter (e.g. `Dž`, U+01C5 Latin Capital Letter D With Small Letter Z With Caron)
+- `Lu` Uppercase Letter (e.g. `B`)
+- `Mn` Nonspacing Mark (e.g. ` ̂`, U+0302 Combining Circumflex Accent)
+- `Sk` Modifier Symbol (e.g. `^`, U+005E Circumflex Accent)
 
 Subsequent characters may have any of the above-listed Unicode categories, or one of the following:
 
-- Decimal Digit Number (e.g. `0`)
-- Letter Number (e.g. `Ⅳ`, U+2163 Roman Numeral Four)
-- Other Number (e.g. `¼`, U+00BC Vulgar Fraction One Quarter)
+- `Nd` Decimal Digit Number (e.g. `0`)
+- `Nl` Letter Number (e.g. `Ⅳ`, U+2163 Roman Numeral Four)
+- `No` Other Number (e.g. `¼`, U+00BC Vulgar Fraction One Quarter)
 
 ## Constants
-- 
cgit v1.2.3
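
The identifier rule as it reads after this patch can be sketched as a small check against Unicode general categories. This is a minimal illustration, not the project's implementation: the names `START_CATEGORIES`, `CONTINUE_CATEGORIES`, and `is_identifier` are hypothetical, and the sketch assumes Python's standard `unicodedata` module as the category source. It does not reject keywords, since the excerpt does not say how keywords and identifiers interact.

```python
import unicodedata

# Categories allowed for the first character, taken from the patched list.
# `Cf` (Format Other) is deliberately absent: the patch removes it.
START_CATEGORIES = frozenset({"Pc", "Ll", "Lm", "Lo", "Lt", "Lu", "Mn", "Sk"})

# Later characters may also be any of the three number categories.
CONTINUE_CATEGORIES = START_CATEGORIES | {"Nd", "Nl", "No"}


def is_identifier(text: str) -> bool:
    """Return True if `text` satisfies the category rule for identifiers."""
    if not text:
        # An identifier is one or more characters.
        return False
    first, rest = text[0], text[1:]
    if unicodedata.category(first) not in START_CATEGORIES:
        return False
    return all(unicodedata.category(ch) in CONTINUE_CATEGORIES for ch in rest)
```

Under this sketch, `is_identifier("_x1")` holds because `_` is `Pc`, `x` is `Ll`, and `1` is `Nd`, while `is_identifier("1x")` fails because `Nd` is not a legal starting category. A Zero-Width Joiner (U+200D, category `Cf`) is rejected anywhere in the identifier.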