From c63fa2bbbdde21fb45ea98e3c69bbddbf25097a5 Mon Sep 17 00:00:00 2001
From: Melody Horn <melody@boringcactus.com>
Date: Thu, 29 Oct 2020 17:26:20 -0600
Subject: start to elaborate on syntax

---
 language/flow-control.rst |  6 ++++++
 language/index.rst        | 26 ++++++++++++++++++++++++++
 language/scanning.rst     | 42 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 74 insertions(+)
 create mode 100644 language/flow-control.rst
 create mode 100644 language/index.rst
 create mode 100644 language/scanning.rst

diff --git a/language/flow-control.rst b/language/flow-control.rst
new file mode 100644
index 0000000..853b5bd
--- /dev/null
+++ b/language/flow-control.rst
@@ -0,0 +1,6 @@
+Flow Control
+============
+
+.. crowbar:keyword:: break
+
+    This keyword exits the containing loop.
diff --git a/language/index.rst b/language/index.rst
new file mode 100644
index 0000000..79702c5
--- /dev/null
+++ b/language/index.rst
@@ -0,0 +1,26 @@
+Language
+========
+
+The syntax of Crowbar is designed to be similar to the syntax of C.
+
+A Crowbar source file is encoded as UTF-8.
+Crowbar source files come in two varieties:
+
+.. glossary::
+
+    header file
+        A Crowbar source file declaring types and functions.
+        Can be intended for internal use within a project, or to define the public API of a library.
+        Conventionally has the ``.hro`` file extension.
+
+    implementation file
+        A Crowbar source file providing function definitions, and sometimes its own type declarations.
+        Conventionally has the ``.cro`` file extension.
+
+A Crowbar source file is read into memory in two phases: *scanning* (which converts text into an unstructured sequence of tokens) and *parsing* (which converts that sequence of tokens into a parse tree).
+
+..  toctree::
+    :maxdepth: 1
+    
+    scanning
+    flow-control
diff --git a/language/scanning.rst b/language/scanning.rst
new file mode 100644
index 0000000..7a7b7d3
--- /dev/null
+++ b/language/scanning.rst
@@ -0,0 +1,42 @@
+Scanning
+--------
+
+.. glossary::
+
+    keyword
+        One of the literal words ``bool``, :crowbar:ref:`break`, ``case``,
+        ``char``, ``const``, ``continue``, ``default``, ``do``, ``double``,
+        ``else``, ``enum``, ``extern``, ``float``, ``for``, ``fragile``,
+        ``function``, ``if``, ``include``, ``int``, ``long``, ``return``,
+        ``short``, ``signed``, ``sizeof``, ``struct``, ``switch``,
+        ``unsigned``, ``void``, or ``while``.
+    
+    identifier
+        A nonempty sequence of characters blah blah blah
+
+        .. todo::
+
+            figure out https://www.unicode.org/reports/tr31/tr31-33.html
+
+    decimal constant
+        A sequence of characters matching the regular expression ``[0-9_]+``.
+        Denotes the numeric value of the given sequence of decimal digits.
+        Underscores are ignored by the compiler, but may be useful separators for other readers.
+    
+    binary constant
+        A sequence of characters matching the regular expression ``0[bB][01_]+``.
+        Denotes the numeric value of the given sequence of binary digits (after the ``0[bB]`` prefix has been removed).
+        Underscores are ignored by the compiler, but may be useful separators for other readers.
+    
+    octal constant
+        A sequence of characters matching the regular expression ``0o[0-7_]+``.
+        Denotes the numeric value of the given sequence of octal digits (after the ``0o`` prefix has been removed).
+        Underscores are ignored by the compiler, but may be useful separators for other readers.
+
+    token
+        A single atomic unit in a Crowbar source file.
+        Has one (and exactly one) of the following types.
+
+.. todo::
+
+    finish transcribing token definitions
-- 
cgit v1.2.3
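
The numeric constant entries added to ``language/scanning.rst`` above can be read as a small recognizer. The following Python sketch is not part of the patch and is not Crowbar's implementation; it only copies the three regular expressions from the glossary and strips underscores before converting. The function name ``constant_value`` and the decision to reject a match that contains no digits at all (such as ``_`` or ``0b_``) are assumptions, since the text does not say how those cases are handled.

```python
import re

# Each entry: (kind, pattern from the glossary, numeric base, prefix length).
PATTERNS = [
    ("binary", re.compile(r"0[bB][01_]+"), 2, 2),
    ("octal", re.compile(r"0o[0-7_]+"), 8, 2),
    ("decimal", re.compile(r"[0-9_]+"), 10, 0),
]


def constant_value(text):
    """Return (kind, value) if text is a numeric constant, else None."""
    for kind, pattern, base, prefix_len in PATTERNS:
        if pattern.fullmatch(text):
            # Drop the prefix, then drop the underscores the compiler ignores.
            digits = text[prefix_len:].replace("_", "")
            if not digits:
                # Assumption: a match made only of underscores has no value.
                return None
            return kind, int(digits, base)
    return None
```

Under these assumptions, ``constant_value("0b1010_1010")`` yields ``("binary", 170)``, while ``constant_value("0O17")`` yields ``None`` because the octal pattern as written accepts only a lowercase ``o``.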