From ce992c815f05c8487850872bfe44999a08b1f340 Mon Sep 17 00:00:00 2001
From: Melody Horn
Date: Tue, 20 Oct 2020 10:25:25 -0600
Subject: add more details

---
 _build.yml       | 13 +++++++++++++
 index.md         |  8 +++++---
 safety.md        | 24 +++++++++++++++++++++++-
 tagged-unions.md |  1 +
 types.md         |  1 +
 5 files changed, 43 insertions(+), 4 deletions(-)
 create mode 100644 _build.yml
 create mode 100644 tagged-unions.md
 create mode 100644 types.md

diff --git a/_build.yml b/_build.yml
new file mode 100644
index 0000000..3276c5f
--- /dev/null
+++ b/_build.yml
@@ -0,0 +1,13 @@
+image: debian/stable
+packages:
+  - pandoc
+  - wkhtmltopdf
+  - poppler-utils
+sources:
+  - https://git.sr.ht/~boringcactus/crowbar-spec
+tasks:
+  - page-count: |
+      cd crowbar-spec
+      pandoc -s -o ../spec.pdf -t html *.md
+      cd ..
+      pdfinfo spec.pdf | grep Pages
diff --git a/index.md b/index.md
index 55d1646..2d19c0a 100644
--- a/index.md
+++ b/index.md
@@ -26,7 +26,6 @@ Some of the footguns and complexity in C come from misfeatures that can simply n
 - Chaining relational/equality operators (e.g. `3 < x == 2`)
 - Mixed chains of bitwise or logical operators (e.g. `2 & x && 4 ^ y`)
 - The comma operator `,`
-- Strings that aren't UTF-8
 
 ### Explicit Beats Implicit
 
@@ -53,14 +52,17 @@ Some of the footguns and complexity in C come from misfeatures that can simply n
 
 Some C features are footguns by default, so Crowbar ensures that they are only used correctly.
 
-- Unions blah blah blah
+- Unions are not robust by default.
+  Crowbar only supports unions when they are [tagged unions](tagged-unions.md).
 
 C's syntax isn't perfect, but it's usually pretty good.
 However, sometimes it just sucks, and in those cases Crowbar makes changes.
 
-- Complicated types (function pointers, pointer-to-`const` vs `const`-pointer, etc)
+- C's variable declaration syntax is far from intuitive in nontrivial cases (function pointers, pointer-to-`const` vs `const`-pointer, etc).
+  Crowbar uses [simplified type syntax](types.md) to keep types and variable names distinct.
 - `_Bool` is just `bool`, `_Complex` is just `complex` (why drag the preprocessor into it?)
 - Adding a `_` to numeric literals as a separator
+- All string literals, char literals, etc are UTF-8
 
 # Additions
 
diff --git a/safety.md b/safety.md
index d353227..6550492 100644
--- a/safety.md
+++ b/safety.md
@@ -4,7 +4,29 @@ Each item in Wikipedia's [list of types of memory errors](https://en.wikipedia.o
 
 ## Buffer overflow
 
-bounds checking based on uhhhh something
+Crowbar addresses buffer overflow with bounds checking.
+In C, the type `char *` can point to a single character, a null-terminated string of unknown length, a buffer of fixed size, or nothing at all.
+In Crowbar, the type `char *` can only point to either a single character or nothing at all.
+If a buffer is declared as `char[50] name;` then it has type `char[50]`, and can be implicitly converted to `(char[50])*`, a pointer-to-50-chars.
+If memory is dynamically allocated, it works as follows:
+
+```crowbar
+void process(size_t bufferSize, char[bufferSize] buffer) {
+    // do some work with buffer, given that we know its size
+}
+
+int main(int argc, (char[1024?])[argc] argv) {
+    size_t bufferSize = getBufferSize();
+    (char[bufferSize])* buffer = malloc(bufferSize);
+    process(bufferSize, buffer);
+    free(buffer);
+}
+```
+
+Note that `malloc` as part of the Crowbar standard library has signature `(char[size])* malloc(size_t size);` and so no cast is needed above.
+Note as well that the type of `argv` is complicated.
+This is because the elements of `argv` have unconstrained size.
+TODO figure out if that's the right way to handle that
 
 ## Buffer over-read
 
diff --git a/tagged-unions.md b/tagged-unions.md
new file mode 100644
index 0000000..1ea1912
--- /dev/null
+++ b/tagged-unions.md
@@ -0,0 +1 @@
+TODO
diff --git a/types.md b/types.md
new file mode 100644
index 0000000..1ea1912
--- /dev/null
+++ b/types.md
@@ -0,0 +1 @@
+TODO
-- 
cgit v1.2.3
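
Note on the `safety.md` hunk: the bounds-checking idea it describes (a buffer whose type carries its size) can be approximated in plain C by passing the length alongside the pointer and checking every access. The following is only a rough sketch under that assumption. `sized_buffer` and its helper functions are hypothetical names invented for illustration; they are not part of Crowbar or of this patch, and the check here happens at run time rather than in the type system.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type, not part of Crowbar or this patch: a buffer that
   carries its length at run time, approximating `char[bufferSize]`. */
typedef struct {
    size_t size;  /* number of usable bytes in data */
    char *data;   /* NULL when allocation failed */
} sized_buffer;

/* Allocate a zero-filled buffer; on failure size is 0 and data is NULL. */
static sized_buffer sized_buffer_alloc(size_t size) {
    sized_buffer buf = { 0, NULL };
    buf.data = calloc(size, 1);
    if (buf.data != NULL) {
        buf.size = size;
    }
    return buf;
}

/* Bounds-checked write: returns 1 on success, 0 if index is out of range. */
static int sized_buffer_set(sized_buffer buf, size_t index, char value) {
    if (buf.data == NULL || index >= buf.size) {
        return 0;
    }
    buf.data[index] = value;
    return 1;
}

/* Bounds-checked read: stores the byte in *out and returns 1,
   or returns 0 and leaves *out untouched if index is out of range. */
static int sized_buffer_get(sized_buffer buf, size_t index, char *out) {
    if (buf.data == NULL || index >= buf.size) {
        return 0;
    }
    *out = buf.data[index];
    return 1;
}

/* Release the storage and reset the fields so later accesses fail the check. */
static void sized_buffer_free(sized_buffer *buf) {
    free(buf->data);
    buf->data = NULL;
    buf->size = 0;
}
```

With a 4-byte buffer from `sized_buffer_alloc(4)`, a call such as `sized_buffer_set(buf, 4, 'y')` is rejected instead of writing past the end, which is the class of error the hunk says Crowbar aims to rule out.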