From c63fa2bbbdde21fb45ea98e3c69bbddbf25097a5 Mon Sep 17 00:00:00 2001
From: Melody Horn
Date: Thu, 29 Oct 2020 17:26:20 -0600
Subject: start to elaborate on syntax

---
 language/flow-control.rst |  6 ++++++
 language/index.rst        | 26 ++++++++++++++++++++++++++
 language/scanning.rst     | 42 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 74 insertions(+)
 create mode 100644 language/flow-control.rst
 create mode 100644 language/index.rst
 create mode 100644 language/scanning.rst

(limited to 'language')

diff --git a/language/flow-control.rst b/language/flow-control.rst
new file mode 100644
index 0000000..853b5bd
--- /dev/null
+++ b/language/flow-control.rst
@@ -0,0 +1,6 @@
+Flow Control
+============
+
+.. crowbar:keyword:: break
+
+   This keyword exits the containing loop.
diff --git a/language/index.rst b/language/index.rst
new file mode 100644
index 0000000..79702c5
--- /dev/null
+++ b/language/index.rst
@@ -0,0 +1,26 @@
+Language
+========
+
+The syntax of Crowbar is designed to be similar to the syntax of C.
+
+A Crowbar source file is UTF-8.
+Crowbar source files can come in two varieties:
+
+.. glossary::
+
+   header file
+      A Crowbar source file declaring types and functions.
+      Can be intended for internal use within a project, or to define the public API of a library.
+      Conventionally has the ``.hro`` file extension.
+
+   implementation file
+      A Crowbar source file providing function definitions, and sometimes its own type declarations.
+      Conventionally has the ``.cro`` file extension.
+
+A Crowbar source file is read into memory in two phases: *scanning* (which converts text into an unstructured sequence of tokens) and *parsing* (which converts an unstructured sequence of tokens into a parse tree).
+
+.. toctree::
+   :maxdepth: 1
+
+   scanning
+   flow-control
diff --git a/language/scanning.rst b/language/scanning.rst
new file mode 100644
index 0000000..7a7b7d3
--- /dev/null
+++ b/language/scanning.rst
@@ -0,0 +1,42 @@
+Scanning
+--------
+
+.. glossary::
+
+   keyword
+      One of the literal words ``bool``, :crowbar:ref:`break`, ``case``,
+      ``char``, ``const``, ``continue``, ``default``, ``do``, ``double``,
+      ``else``, ``enum``, ``extern``, ``float``, ``for``, ``fragile``,
+      ``function``, ``if``, ``include``, ``int``, ``long``, ``return``,
+      ``short``, ``signed``, ``sizeof``, ``struct``, ``switch``,
+      ``unsigned``, ``void``, or ``while``.
+
+   identifier
+      A nonempty sequence of characters blah blah blah
+
+      .. todo::
+
+         figure out https://www.unicode.org/reports/tr31/tr31-33.html
+
+   decimal constant
+      A sequence of characters matching the regular expression ``[0-9_]+``.
+      Denotes the numeric value of the given sequence of decimal digits.
+      Underscores are ignored by the compiler, but may be useful separators for other readers.
+
+   binary constant
+      A sequence of characters matching the regular expression ``0[bB][01_]+``.
+      Denotes the numeric value of the given sequence of binary digits (after the ``0[bB]`` prefix has been removed).
+      Underscores are ignored by the compiler, but may be useful separators for other readers.
+
+   octal constant
+      A sequence of characters matching the regular expression ``0o[0-7_]+``.
+      Denotes the numeric value of the given sequence of octal digits (after the ``0o`` prefix has been removed).
+      Underscores are ignored by the compiler, but may be useful separators for other readers.
+
+   token
+      A single atomic unit in a Crowbar source file.
+      Has one (and exactly one) of the following types.
+
+.. todo::
+
+   finish transcribing token definitions
-- 
cgit v1.2.3
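
As a rough illustration of the numeric constant entries this patch adds to ``language/scanning.rst``, the following Python sketch classifies a piece of text and computes its value. It is not taken from any Crowbar implementation: the names ``CONSTANT_PATTERNS`` and ``constant_value`` are hypothetical, matching the whole text with ``fullmatch`` is an assumption, and rejecting a constant that contains no digits at all (for example ``_`` or ``0b_``) is a guess at a case the glossary text leaves open.

```python
import re

# Regular expressions copied from the glossary entries in the patch.
# Each entry: (kind, pattern, prefix length to strip, numeric base).
CONSTANT_PATTERNS = [
    ("binary", re.compile(r"0[bB][01_]+"), 2, 2),
    ("octal", re.compile(r"0o[0-7_]+"), 2, 8),
    ("decimal", re.compile(r"[0-9_]+"), 0, 10),
]


def constant_value(text):
    """Return (kind, value) if text is a numeric constant, else None.

    Hypothetical helper: the patch defines only the regular expressions
    and the rule that underscores are ignored.
    """
    for kind, pattern, prefix_len, base in CONSTANT_PATTERNS:
        if pattern.fullmatch(text):
            # Drop the prefix (if any), then ignore underscores.
            digits = text[prefix_len:].replace("_", "")
            if not digits:
                # Assumption: a constant with no digits is an error.
                raise ValueError(f"{kind} constant has no digits: {text!r}")
            return kind, int(digits, base)
    return None


if __name__ == "__main__":
    for sample in ["1_000_000", "0b1010_1010", "0o17", "break"]:
        print(sample, "->", constant_value(sample))
```

Note that, read literally, the octal pattern accepts only a lowercase ``0o`` prefix, while the binary pattern accepts both ``0b`` and ``0B``; the sketch follows the patterns as written.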