From fa1e8d9d6c2dad41c9c2e7c9efac8502ed321b4c Mon Sep 17 00:00:00 2001
From: Tab Atkins-Bittner
Date: Wed, 8 Sep 2021 10:18:25 -0700
Subject: [PATCH] Allow node names to have type annotations. Bump JiK to take advantage of this.

---
 JSON-IN-KDL.md | 26 ++++++++++++--------------
 SPEC.md        | 19 ++++++++++++++-----
 2 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/JSON-IN-KDL.md b/JSON-IN-KDL.md
index 3addc99..5340cce 100644
--- a/JSON-IN-KDL.md
+++ b/JSON-IN-KDL.md
@@ -3,7 +3,7 @@ JSON-in-KDL (JiK)

 This specification describes a canonical way to losslessly encode [JSON](https://json.org) in [KDL](https://kdl.dev). While this isn't a very useful thing to want to do on its own, it's occasionally useful when using a KDL toolchain while speaking with a JSON-consuming or -emitting service.

-This is version 1.0.1 of JiK.
+This is version 2.0.0 of JiK.

 JSON-in-KDL (JiK from now on) is a kdl microsyntax consisting of three types of nodes:

@@ -44,16 +44,14 @@ array 1 {

 Object nodes are used to represent a JSON object. They can contain zero or more named properties, followed by zero or more child nodes; these are taken as the key/value pairs of the object, in order of appearance.

-The named properties of an object node are key/value pairs, used when the value is a literal.
+If the value of a key/value pair is a literal, it can be encoded as a named property on the object. For example, the JSON object `{"foo": 1, "bar": true}` could be written in JiK as `object foo=1 bar=true`.

-For example, the JSON object `{"foo": 1, "bar": true}` could be written in JiK as `object foo=1 bar=true`.
-
-The children of an object node have a slightly modified syntax: they must contain a string as their first value, giving their "key"; the child itself, besides the "key" argument, is the "value". The preceding example could instead have been written as:
+Alternately, key/value pairs can be encoded as child nodes, using a type annotation on the node name to encode the key, and the node itself as the value. The preceding example could instead have been written as:

 ```kdl
 object {
-    - "foo" 1
-    - "bar" true
+    (foo)- 1
+    (bar)- true
 }
 ```

@@ -61,18 +59,18 @@ Of course, using children for literals is overly-verbose. It's only necessary wh

 ```kdl
 object {
-    array "foo" 1 2 {
+    (foo)array 1 2 {
         object bar=3
     }
-    - "baz" 4
+    (baz)- 4
 }
 ```

-As with arrays, child lists and arguments can be mixed. The precise order of a JSON object's keys isn't *meant* to be meaningful, so as long as that's true, *all* the keys with literal values can be pulled into the argument list. The preceding example could thus also be written as:
+As with arrays, child nodes and properties can be mixed. The precise order of a JSON object's keys isn't *meant* to be meaningful, so as long as that's true, *all* the keys with literal values can be pulled into the argument list. The preceding example could thus also be written as:

 ```kdl
 object baz=4 {
-    array "foo" 1 2 {
+    (foo)array 1 2 {
         object bar=3
     }
 }
@@ -84,8 +82,8 @@ Converting JiK back to JSON is a trivial process: literal nodes are encoded as t

 Only valid JiK nodes can be encoded to JSON; if a JiK document contains an invalid node, the entire document must fail to encode, rather than "guessing" at the intent. As well, a JiK document must contain only a single top-level node to be valid, unless the output is intended to be a JSON stream, in which case arbitrary numbers of nodes are allowed, each a separate JSON value.

-* A literal node is valid if it contains a single unnamed argument (or, if a child of an object node, a single unnamed string argument followed by a single unnamed argument).
+* A literal node is valid if it contains a single unnamed argument.

-* An array node is valid if it contains no named properties. If it's the child of an object node, it must contain, as its first argument, a string literal.
+* An array node is valid if it contains only unnamed arguments and/or child nodes without type annotations on their node names.

-* An object node is valid if it contains no unnamed arguments, with the exception that if it's the child of an object node, it must contain, as its first argument, a string literal. It must not contain any repeated "key" strings among its children, whether expressed as named properties or child nodes.
+* An object node is valid if it contains only named properties and/or child nodes with type annotations on their node names. Additionally, all "keys" must be unique within the node, whether they're encoded as property names or type annotations on node names.
diff --git a/SPEC.md b/SPEC.md
index 25e71bd..6a13470 100644
--- a/SPEC.md
+++ b/SPEC.md
@@ -46,6 +46,11 @@ Being a node-oriented language means that the real core component of any KDL
 document is the "node". Every node must have a name, which is either a legal
 [Identifier](#identifier), or a quoted [String](#string).

+The name may be preceded by a [Type Annotation](#type-annotation) to further
+clarify its type, particularly in relation to its parent node. (For example,
+clarifying that a particular `date` child node is for the _publication_ date,
+rather than the last-modified date, with `(published)date`.)
+
 Following the name are zero or more [Arguments](#argument) or
 [Properties](#property), separated by either [whitespace](#whitespace) or [a
 slash-escaped line continuation](#line-continuation). Arguments and Properties
@@ -72,7 +77,7 @@ Block](#children-block), a semicolon (`;`) or the end of the file/stream (an
 ```kdl
 foo 1 key="val" 3 {
     bar
-    baz
+    (role)baz 1 2
 }
 ```

@@ -193,13 +198,15 @@ Values _MAY_ be prefixed by a single [Type Annotation](#type-annotation).

 ### Type Annotation

-A type annotation is a prefix to any [Value](#value) that includes a
-_suggestion_ of what type the value is _intended_ to be treated as.
+A type annotation is a prefix to any [Node Name](#node) or [Value](#value) that
+includes a _suggestion_ of what type the value is _intended_ to be treated as,
+or as a _context-specific elaboration_ of the more generic type the node name
+indicates.

 Type annotations are written as a set of `(` and `)` with a single
 [Identifier](#identifier) in it. Any valid identifier is considered a valid type
 annotation. There must be no whitespace between a type annotation and its
-associated Value.
+associated Node Name or Value.

 KDL does not specify any restrictions on what implementations might do with
 these annotations. They are free to ignore them, or use them to make decisions
@@ -259,6 +266,8 @@ IEEE 754 floating point numbers, both single (32) and double (64) precision:
 ```kdl
 node (u8)123
 node prop=(regex)".*"
+(published)date "1970-01-01"
+(contributor)person name="Foo McBar"
 ```

 ### String
@@ -399,7 +408,7 @@ Note that for the purpose of new lines, CRLF is considered _a single newline_.
 ```
 nodes := linespace* (node nodes?)? linespace*

-node := ('/-' node-space*)? identifier (node-space node-space* node-props-and-args)* (node-space* node-children ws*)? node-space* node-terminator
+node := ('/-' node-space*)? (type ws*)? identifier (node-space node-space* node-props-and-args)* (node-space* node-children ws*)? node-space* node-terminator
 node-props-and-args := ('/-' node-space*)? (prop | value)
 node-children := ('/-' node-space*)? '{' nodes '}'
 node-space := ws* escline ws* | ws+