Diffstat (limited to 'syntax.md')
-rw-r--r-- syntax.md | 36
1 files changed, 17 insertions, 19 deletions
diff --git a/syntax.md b/syntax.md
index bf0d3f1..0ca25b3 100644
--- a/syntax.md
+++ b/syntax.md
@@ -1,10 +1,8 @@
----
-title: Syntax
----
+# Syntax
The syntax of Crowbar mostly matches the syntax of C, with fewer obscure/advanced/edge case features.
-# Source Files
+## Source Files
A Crowbar source file is UTF-8.
Crowbar source files can come in two varieties, an *implementation file* and a *header file*.
@@ -12,7 +10,7 @@ An implementation file conventionally has a `.cro` extension, and a header file
A Crowbar source file is read into memory in two phases: *scanning* (which converts text into an unstructured sequence of tokens) and *parsing* (which converts an unstructured sequence of tokens into a parse tree).
-# Scanning
+## Scanning
A *token* is one of the following kinds of token:
@@ -24,7 +22,7 @@ A *token* is one of the following kinds of token:
Tokens are separated by either *whitespace* or a *comment*.
-## Keywords
+### Keywords
A *keyword* is one of the following literal words:
@@ -59,7 +57,7 @@ A *keyword* is one of the following literal words:
- `void`
- `while`
-## Identifiers
+### Identifiers
An *identifier* is a sequence of one or more characters having Unicode categories within a legal set.
@@ -80,7 +78,7 @@ Subsequent characters may have any of the above-listed Unicode categories, or on
- `Nl` Letter Number (e.g. `Ⅳ`, U+2163 Roman Numeral Four)
- `No` Other Number (e.g. `¼`, U+00BC Vulgar Fraction One Quarter)
-## Constants
+### Constants
A *constant* can have one of six types:
@@ -94,7 +92,7 @@ A *constant* can have one of six types:
- or a `.` followed by a decimal constant followed by either an `e` or `E` followed by a decimal constant;
- or a *character constant*, a `'` followed by either a single character or an *escape sequence* followed by another `'`.
-### Escape Sequences
+#### Escape Sequences
The following sequences of characters are *escape sequences*:
@@ -109,13 +107,13 @@ The following sequences of characters are *escape sequences*:
- `\u` followed by four characters drawn from the set {`0`, `1`, `2`, `3`, `4`, `5`, `6`, `7`, `8`, `9`, `A`, `a`, `B`, `b`, `C`, `c`, `D`, `d`, `E`, `e`, `F`, `f`}
- `\U` followed by eight characters drawn from the set {`0`, `1`, `2`, `3`, `4`, `5`, `6`, `7`, `8`, `9`, `A`, `a`, `B`, `b`, `C`, `c`, `D`, `d`, `E`, `e`, `F`, `f`}
-## String Literals
+### String Literals
A *string literal* begins with a `"`.
It then contains a sequence where each element is either an escape sequence or a character that is neither `"` nor `\`.
It then ends with a `"`.
-## Punctuators
+### Punctuators
The following sequences of characters form *punctuators*:
@@ -161,11 +159,11 @@ The following sequences of characters form *punctuators*:
- `|=`
- `^=`
-## Whitespace
+### Whitespace
A nonempty sequence of characters is considered to be *whitespace* if each character in it has a Unicode class of either Space Separator or Control Other.
-## Comments
+### Comments
A *comment* can be either a *line comment* or a *block comment*.
@@ -173,11 +171,11 @@ A *line comment* begins with the characters `//` if they occur outside of a stri
A *block comment* begins with the characters `/*` if they occur outside of a string literal or comment, and ends with the characters `*/`.
-# Parsing
+## Parsing
The syntax of Crowbar is given as a [parsing expression grammar](https://en.wikipedia.org/wiki/Parsing_expression_grammar):
-## Entry points
+### Entry points
```
HeaderFile ← HeaderFileElement+
@@ -190,7 +188,7 @@ ImplementationFileElement ← HeaderFileElement /
FunctionDefinition
```
-## Top-level elements
+### Top-level elements
```
IncludeStatement ← 'include' string-literal ';'
@@ -211,7 +209,7 @@ SignatureArguments ← Type identifier ',' SignatureArguments /
Type identifier ','?
```
-## Statements
+### Statements
```
Block ← '{' Statement* '}'
@@ -263,7 +261,7 @@ AssignmentStatementBody ← AssignmentTargetExpression '=' Expression /
ExpressionStatement ← Expression ';'
```
-## Types
+### Types
```
Type ← 'const' BasicType /
@@ -288,7 +286,7 @@ IntegerType ← 'char' /
'long'
```
-## Expressions
+### Expressions
```
AssignmentTargetExpression ← identifier ATEElementSuffix*