From fb80016011b74e259f830cf119e9b0b19049d586 Mon Sep 17 00:00:00 2001
From: Bram Gotink
Date: Fri, 2 Sep 2022 22:54:08 +0200
Subject: [PATCH 01/24] Make JSON-in-KDL more minimalistic (#293)

---
 JSON-IN-KDL.md | 78 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 54 insertions(+), 24 deletions(-)

diff --git a/JSON-IN-KDL.md b/JSON-IN-KDL.md
index 5340cce..96399bd 100644
--- a/JSON-IN-KDL.md
+++ b/JSON-IN-KDL.md
@@ -3,30 +3,28 @@ JSON-in-KDL (JiK)
 
 This specification describes a canonical way to losslessly encode [JSON](https://json.org) in [KDL](https://kdl.dev). While this isn't a very useful thing to want to do on its own, it's occasionally useful when using a KDL toolchain while speaking with a JSON-consuming or -emitting service.
 
-This is version 2.0.0 of JiK.
+This is version 3.0.0 of JiK.
 
-JSON-in-KDL (JiK from now on) is a kdl microsyntax consisting of three types of nodes:
+JSON-in-KDL (JiK from now on) is a kdl microsyntax consisting of named nodes that represent objects, arrays, or literal values.
 
-* literal nodes, with `-` as the nodename
-* array nodes, with `array` as the nodename
-* object nodes, with `object` as the nodename
+The name "-" is used for nodes that are nameless, i.e. the top-level node and items in an array.
 
 ----
 
 Literal nodes are used to represent a JSON literal, which luckily KDL's literal syntax is a superset of. They contain a single value, the literal they're representing. For example, to represent the JSON literal `true`, you'd write `- true` in JiK.
 
-(In many cases this isn't necessary, and KDL literals can be directly used instead. Literal nodes are necessary only for a top-level literal, or to intersperse literals with arrays or objects inside an array or object node.)
+(In many cases this isn't necessary, and KDL literals can be directly used instead. Literal nodes are necessary only for a top-level literal, or as item in an array.)
 
 ----
 
 Array nodes are used to represent a JSON array. They can contain zero or more unnamed arguments, followed by zero or more child nodes; these are taken as the items of the array, in order of appearance.
 
-This means that simple arrays of literals can be written compactly and simply; a JSON array like `[1,2,3]` can be written in JiK as `array 1 2 3`. When an array contains nested arrays or objects, the child nodes are used; a JSON array like `[1, [true, false], 3]` can be written in JiK as:
+This means that simple arrays of literals can be written compactly and simply; a JSON array like `[1,2,3]` can be written in JiK as `- 1 2 3`. When an array contains nested arrays or objects, the child nodes are used; a JSON array like `[1, [true, false], 3]` can be written in JiK as:
 
 ```kdl
-array {
+- {
     - 1
-    array true false
+    - true false
     - 3
 }
 ```
@@ -34,8 +32,8 @@ array {
 The two methods of writing children can be mixed, pulling the prefix of the array that is just literals into the arguments of the node. The preceding example could thus also be written as:
 
 ```kdl
-array 1 {
-    array true false
+- 1 {
+    - true false
     - 3
 }
 ```
@@ -44,46 +42,78 @@ array 1 {
 
 Object nodes are used to represent a JSON object. They can contain zero or more named properties, followed by zero or more child nodes; these are taken as the key/value pairs of the object, in order of appearance.
 
-If the value of a key/value pair is a literal, it can be encoded as a named property on the object. For example, the JSON object `{"foo": 1, "bar": true}` could be written in JiK as `object foo=1 bar=true`.
+If the value of a key/value pair is a literal, it can be encoded as a named property on the object. For example, the JSON object `{"foo": 1, "bar": true}` could be written in JiK as `- foo=1 bar=true`.
 
 Alternately, key/value pairs can be encoded as child nodes, using a type annotation on the node name to encode the key, and the node itself as the value. The preceding example could instead have been written as:
 
 ```kdl
-object {
-    (foo)- 1
-    (bar)- true
+- {
+    foo 1
+    bar true
 }
 ```
 
 Of course, using children for literals is overly-verbose. It's only necessary when nesting arrays or objects into objects; for example, the JSON object `{"foo": [1, 2, {"bar": 3}], "baz":4}` can be written in JiK as:
 
 ```kdl
-object {
-    (foo)array 1 2 {
-        object bar=3
+- {
+    foo 1 2 {
+        - bar=3
     }
-    (baz)- 4
+    baz 4
 }
 ```
 
 As with arrays, child nodes and properties can be mixed. The precise order of a JSON object's keys isn't *meant* to be meaningful, so as long as that's true, *all* the keys with literal values can be pulled into the argument list. The preceding example could thus also be written as:
 
 ```kdl
-object baz=4 {
-    (foo)array 1 2 {
-        object bar=3
+- baz=4 {
+    foo 1 2 {
+        - bar=3
     }
 }
 ```
 
 ----
 
+There are two cases where there can be ambiguity between the three kinds of nodes. These can be solved by explicitly marking the node as an array or object using a tag.
+
+An array with a single item cannot be represented using a node with a single value as that would make it a literal node. The `(array)` tag can be used to mark this node as an array instead.
+For example, the node `- true` is the literal `true`, while `(array)- true` is the array `[true]`.
+
+An object with a single property named "-" that is encoded as a child node will be interpreted as an array with a single item. The `(object)` tag can be used to mark this node as an object instead.
+For example, `- { - true; }` is the array `[true]`, while `(object)- { - true; }` is the object `{"-": true}`.
+
+----
+
 Converting JiK back to JSON is a trivial process: literal nodes are encoded as their literal value; array nodes are encoded as their items, comma-separated and surrounded with `[]`; object nodes are encoded as their key/value pairs, comma-separated and surrounded with `{}`.
 
 Only valid JiK nodes can be encoded to JSON; if a JiK document contains an invalid node, the entire document must fail to encode, rather than "guessing" at the intent. As well, a JiK document must contain only a single top-level node to be valid, unless the output is intended to be a JSON stream, in which case arbitrary numbers of nodes are allowed, each a separate JSON value.
 
 * A literal node is valid if it contains a single unnamed argument.
 
-* An array node is valid if it contains only unnamed arguments and/or child nodes without type annotations on their node names.
+* An array node is valid if it contains only unnamed arguments and/or child nodes named "-".
 
-* An object node is valid if it contains only named properties and/or child nodes with type annotations on their node names. Additionally, all "keys" must be unique within the node, whether they're encoded as property names or type annotations on node names.
+* An object node is valid if it contains only named properties and/or child nodes. Additionally, all "keys" must be unique within the node, whether they're encoded as property names or type annotations on node names.
+
+----
+
+The name of the top-level JiK node is not taken into account. This allows for using a declarative node name instead of "-".
+
+It is possible to embed JiK inside KDL documents. Here's a fictitious example describing an HTTP request with a JSON body, where the `body` node is an embedded JiK node.
+
+```kdl
+request "/api/cart" method="PUT" {
+    body {
+        items {
+            - id=1234 amount=1
+            - id=2341 amount=2 {
+                options {
+                    color "red"
+                    size "XXL"
+                }
+            }
+        }
+    }
+}
+```

From 76d5dd542a9043257bc65476c0a70b94667052a7 Mon Sep 17 00:00:00 2001
From: Tab Atkins Jr
Date: Fri, 2 Sep 2022 15:02:05 -0700
Subject: [PATCH 02/24] Editorial rephrasing; define empty array/object

Did an editorial pass over the document, rewriting most of the prose slightly. Caught a place or two that was still referring to tags for object keys. Sole non-editorial change was adding the final ambiguous case - a completely empty node. These are required to have an `(array)` or `(object)` tag to be valid.
---
 JSON-IN-KDL.md | 117 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 79 insertions(+), 38 deletions(-)

diff --git a/JSON-IN-KDL.md b/JSON-IN-KDL.md
index 96399bd..c256612 100644
--- a/JSON-IN-KDL.md
+++ b/JSON-IN-KDL.md
@@ -3,23 +3,24 @@ JSON-in-KDL (JiK)
 
 This specification describes a canonical way to losslessly encode [JSON](https://json.org) in [KDL](https://kdl.dev). While this isn't a very useful thing to want to do on its own, it's occasionally useful when using a KDL toolchain while speaking with a JSON-consuming or -emitting service.
 
-This is version 3.0.0 of JiK.
+This is version 3.0.1 of JiK.
 
 JSON-in-KDL (JiK from now on) is a kdl microsyntax consisting of named nodes that represent objects, arrays, or literal values.
 
-The name "-" is used for nodes that are nameless, i.e. the top-level node and items in an array.
+----
+
+JSON literals are, luckily, a subset of KDL's literals. There are two ways to write a JSON literal into JiK:
+
+* As a node with any nodename and a single argument, like `- true` (for the JSON `true`) or `foo 5` (for the JSON `5`).
+* When nested in arrays or objects, literals can usually be written as arguments (for array nodes) or properties (for object nodes). See below for details.
 
 ----
 
-Literal nodes are used to represent a JSON literal, which luckily KDL's literal syntax is a superset of. They contain a single value, the literal they're representing. For example, to represent the JSON literal `true`, you'd write `- true` in JiK.
+JSON arrays are represented in JiK as a node with any nodename, with zero or more arguments and/or zero or more children with `-` nodenames.
 
-(In many cases this isn't necessary, and KDL literals can be directly used instead. Literal nodes are necessary only for a top-level literal, or as item in an array.)
+Arguments can encode literals - for example, the JSON `[1, 2, 3]` can be written in JiK as `- 1 2 3`.
 
-----
-
-Array nodes are used to represent a JSON array. They can contain zero or more unnamed arguments, followed by zero or more child nodes; these are taken as the items of the array, in order of appearance.
-
-This means that simple arrays of literals can be written compactly and simply; a JSON array like `[1,2,3]` can be written in JiK as `- 1 2 3`. When an array contains nested arrays or objects, the child nodes are used; a JSON array like `[1, [true, false], 3]` can be written in JiK as:
+Children can encode literals and/or nested arrays and objects. For example, the JSON `[1, [true, false], 3]` can be written in JiK as:
 
 ```kdl
 - {
@@ -30,7 +31,9 @@ This means that simple arrays of literals can be written compactly and simply; a
 }
 ```
 
-The two methods of writing children can be mixed, pulling the prefix of the array that is just literals into the arguments of the node. The preceding example could thus also be written as:
+The arguments and/or children, taken in order, represent the items of the array.
+
+Arguments and children can be mixed, if desired. The preceding example could also be written as:
 
 ```kdl
 - 1 {
@@ -38,51 +41,72 @@ The two methods of writing children can be mixed, pulling the prefix of the arra
 }
 ```
 
+Two otherwise-ambiguous cases must be manually annotated with an `(array)` tag:
+
+* A single-element array (such as `[1]`) written using arguments (as `- 1`) would be ambiguous with a literal node.
+  To indicate this is an array, it must be written as `(array)- 1`
+  (Or rewritten to use child nodes, like `- { - 1 }`.)
+* An empty array (JSON `[]`) must use the `(array)` tag, like `(array)-`.
+
+The `(array)` tag can be used on any other valid array node if desired, but has no effect in such cases.
+
 ----
 
-Object nodes are used to represent a JSON object. They can contain zero or more named properties, followed by zero or more child nodes; these are taken as the key/value pairs of the object, in order of appearance.
+JSON objects are represented in JiK as a node with any nodename, with zero or more properties and/or zero or more children with any nodenames.
 
-If the value of a key/value pair is a literal, it can be encoded as a named property on the object. For example, the JSON object `{"foo": 1, "bar": true}` could be written in JiK as `- foo=1 bar=true`.
+Properties can encode literals - for example, the JSON `{"foo": 1, "bar": true}` can be written in JiK as `- foo=1 bar=true`.
 
-Alternately, key/value pairs can be encoded as child nodes, using a type annotation on the node name to encode the key, and the node itself as the value. The preceding example could instead have been written as:
+Children can encode literals and/or nested arrays and objects,
+using the nodename for the item's key.
+For example, the JSON `{"foo": 1, "bar": [2, {"baz": 3}], "qux":4}` can be written in JiK as:
 
 ```kdl
 - {
     foo 1
-    bar true
-}
-```
-
-Of course, using children for literals is overly-verbose. It's only necessary when nesting arrays or objects into objects; for example, the JSON object `{"foo": [1, 2, {"bar": 3}], "baz":4}` can be written in JiK as:
-
-```kdl
-- {
-    foo 1 2 {
-        - bar=3
+    bar 2 {
+        - baz=3
     }
-    baz 4
+    qux 4
 }
 ```
 
-As with arrays, child nodes and properties can be mixed. The precise order of a JSON object's keys isn't *meant* to be meaningful, so as long as that's true, *all* the keys with literal values can be pulled into the argument list. The preceding example could thus also be written as:
+As with arrays, child nodes and properties can be mixed, so the preceding example could have been written as:
 
 ```kdl
-- baz=4 {
-    foo 1 2 {
-        - bar=3
+- foo=1 {
+    bar 2 {
+        - baz=3
+    }
+    qux 4
+}
+```
+
+Or, so long as the exact order of properties isn't meaningful (it's not *meant* to be in JSON),
+*all* the literal-valued keys can be pulled up into properties,
+leaving children nodes solely for nested arrays and objects:
+
+```kdl
+- foo=1 qux=4 {
+    bar 2 {
+        - baz=3
     }
 }
 ```
 
-----
+The properties and/or children of the node represent the items of the object,
+with the property names and child nodenames as each item's key.
+All "keys" in an object node must be unique.
 
-There are two cases where there can be ambiguity between the three kinds of nodes. These can be solved by explicitly marking the node as an array or object using a tag.
+As with arrays, there are two ambiguous cases that must be manually annoted with the `(object)` tag:
 
-An array with a single item cannot be represented using a node with a single value as that would make it a literal node. The `(array)` tag can be used to mark this node as an array instead.
-For example, the node `- true` is the literal `true`, while `(array)- true` is the array `[true]`.
+* An object containing a single item whose key is "-" (like `{"-": 1}`) written using children (like `- { - 1 }`)
+  would be ambiguous with an array node.
+  To indicate this is an object, it must be written as `(object)- { - 1 }`.
+  (Or, if the sole item's value is a literal, as in this example,
+  it can be rewritten to use properties, as `- -=1`.)
+* An empty object (JSON `{}`) must use the `(object)` tag, like `(object)-`.
 
-An object with a single property named "-" that is encoded as a child node will be interpreted as an array with a single item. The `(object)` tag can be used to mark this node as an object instead.
-For example, `- { - true; }` is the array `[true]`, while `(object)- { - true; }` is the object `{"-": true}`.
+As with array nodes, `(object)` can be used on any valid object node if desired.
 
 ----
 
@@ -92,15 +116,21 @@ Only valid JiK nodes can be encoded to JSON; if a JiK document contains an inval
 
 * A literal node is valid if it contains a single unnamed argument.
 
-* An array node is valid if it contains only unnamed arguments and/or child nodes named "-".
+* An array node is valid if it contains only unnamed arguments and/or child nodes named "-". If it contains no arguments and no child nodes, its nodename *must* have the `(array)` tag.
 
-* An object node is valid if it contains only named properties and/or child nodes. Additionally, all "keys" must be unique within the node, whether they're encoded as property names or type annotations on node names.
+* An object node is valid if it contains only named properties and/or child nodes. Additionally, all "keys" must be unique within the node, whether they're encoded as property names or child node names. If it contains no properties and no child nodes, its nodename *must* have the `(object)` tag.
 
 ----
 
-The name of the top-level JiK node is not taken into account. This allows for using a declarative node name instead of "-".
+Note that, outside of array/object items, the nodename is not meaningful in JiK.
+For simplicity, this document uses `-` for all such nodenames
+(and it is recommended that an automated JSON-to-KDL converter do the same),
+but this means it is possible to write a JiK object as meaningful KDL
+and embed it within a larger KDL document.
 
-It is possible to embed JiK inside KDL documents. Here's a fictitious example describing an HTTP request with a JSON body, where the `body` node is an embedded JiK node.
+Here's a fictitious example describing an HTTP request with a JSON body,
+where the `body` node is an embedded JiK node
+that nevertheless reads as fairly natural KDL.
 
 ```kdl
 request "/api/cart" method="PUT" {
@@ -117,3 +147,14 @@ request "/api/cart" method="PUT" {
     }
 }
 ```
+
+The `body` node represents the JSON object
+
+```json
+{
+    "items": [
+        {"id": 1234, "amount": 1},
+        {"id": 2341, "amount": 2, "options": {"color": "red", "size": "XXL"}}
+    ]
+}
+```

From c8dd45a0f10d48dc2cbe2a9453b489ec2d26c829 Mon Sep 17 00:00:00 2001
From: Thomas Jollans
Date: Sun, 18 Sep 2022 00:18:49 +0200
Subject: [PATCH 03/24] Mention ckdl in the README (#296)

---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d5ca1a1..53ac901 100644
--- a/README.md
+++ b/README.md
@@ -32,11 +32,13 @@ free to jump in and give us your 2 cents!
 * Dart: [kdl-dart](https://github.com/danini-the-panini/kdl-dart)
 * Java: [kdl4j](https://github.com/hkolbeck/kdl4j)
 * PHP: [kdl-php](https://github.com/kdl-org/kdl-php)
-* Python: [kdl-py](https://github.com/tabatkins/kdlpy), [cuddle](https://github.com/djmattyg007/python-cuddle)
+* Python: [kdl-py](https://github.com/tabatkins/kdlpy), [cuddle](https://github.com/djmattyg007/python-cuddle), [ckdl](https://github.com/tjol/ckdl)
 * Elixir: [kuddle](https://github.com/IceDragon200/kuddle)
 * XSLT: [xml2kdl](https://github.com/Devasta/XML2KDL)
 * Haskell: [Hustle](https://github.com/fuzzypixelz/Hustle)
 * .NET: [Kadlet](https://github.com/oledfish/Kadlet)
+* C: [ckdl](https://github.com/tjol/ckdl)
+* C++: [kdlpp](https://github.com/tjol/ckdl) (part of ckdl, requires C++20)
 
 ## Compatibility Test Suite
 

From 0b99021180c12352c85af0087ab9a3aff28dc669 Mon Sep 17 00:00:00 2001
From: Nathan West
Date: Tue, 20 Sep 2022 20:29:59 -0400
Subject: [PATCH 04/24] Improvements to string naming consistency (#299)

This PR modifies string descriptions in SPEC.md to use more consistent language throughout, with the primary intention of removing long descriptions like "a property key is either an identifier or a string". There are no semantic changes to KDL here.
---
 SPEC.md | 76 ++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 46 insertions(+), 30 deletions(-)

diff --git a/SPEC.md b/SPEC.md
index e2fd106..fec972b 100644
--- a/SPEC.md
+++ b/SPEC.md
@@ -49,8 +49,8 @@ baz
 ### Node
 
 Being a node-oriented language means that the real core component of any KDL
-document is the "node". Every node must have a name, which is either a legal
-[Identifier](#identifier), or a quoted [String](#string).
+document is the "node". Every node must have a name, which is an
+[Identifier](#identifier).
 
 The name may be preceded by a [Type Annotation](#type-annotation) to further
 clarify its type, particularly in relation to its parent node. (For example,
@@ -92,13 +92,21 @@ foo 1 key="val" 3 {
 
 ### Identifier
 
-A bare Identifier is composed of any Unicode codepoint other than [non-initial
+An Identifier is either a [Bare Identifier](#bare-identifier), which is an
+unquoted string like `node` or `item`, or a [String](#string), which is quoted,
+like `"node"` or `"two words"`. There's no semantic difference between the
+kinds of identifier; this simply allows for the use of quotes to have unusual
+identifiers that are inexpressible as bare identifiers.
+
+### Bare Identifier
+
+A Bare Identifier is composed of any Unicode codepoint other than [non-initial
 characters](#non-initial-characters), followed by any number of Unicode
 codepoints other than [non-identifier characters](#non-identifier-characters),
 so long as this doesn't produce something confusable for a [Number](#number),
 [Boolean](#boolean), or [Null](#null). For example, both a [Number](#number)
 and an Identifier can start with `-`, but when an Identifier starts with `-`
-the second character cannot be a digit. This is precicely specified in the
+the second character cannot be a digit. This is precicely specified in the
 [Full Grammar](#full-grammar) below.
 
 Identifiers are terminated by [Whitespace](#whitespace) or
@@ -106,8 +114,8 @@ Identifiers are terminated by [Whitespace](#whitespace) or
 
 ### Non-initial characters
 
-The following characters cannot be the first character in a bare
-[Identifier](#identifier):
+The following characters cannot be the first character in a
+[Bare Identifier](#identifier):
 
 * Any decimal digit (0-9)
 * Any [non-identifier characters](#non-identifier-characters)
@@ -119,8 +127,7 @@ of having an identifier look like a negative number.
 
 ### Non-identifier characters
 
-The following characters cannot be used anywhere in a bare
-[Identifier](#identifier):
+The following characters cannot be used anywhere in a [Bare Identifier](#identifier):
 
 * Any codepoint with hexadecimal value `0x20` or below.
 * Any codepoint with hexadecimal value higher than `0x10FFFF`.
@@ -137,6 +144,7 @@ characters and an optional single-line comment. It must be terminated by a
 Following a line continuation, processing of a Node can continue as usual.
 
 #### Example
+
 ```kdl
 my-node 1 2 \  // comments are ok after \
         3 4    // This is the actual end of the Node.
@@ -145,8 +153,7 @@ my-node 1 2 \ // comments are ok after \
 ### Property
 
 A Property is a key/value pair attached to a [Node](#node). A Property is
-composed of an [Identifier](#identifier) or a [String](#string), followed
-immediately by a `=`, and then a [Value](#value).
+composed of an [Identifier](#identifier), followed immediately by a `=`, and then a [Value](#value).
 
 Properties should be interpreted left-to-right, with rightmost properties with
 identical names overriding earlier properties. That is:
@@ -167,7 +174,7 @@ make it act as plain whitespace, even if it spreads across multiple lines.
 ### Argument
 
 An Argument is a bare [Value](#value) attached to a [Node](#node), with no
-associated key. It shares the same space as [Properties](#properties).
+associated key. It shares the same space as [Properties](#properties), and may be interleaved with them.
 
 A Node may have any number of Arguments, which should be evaluated left to
 right. KDL implementations _MUST_ preserve the order of Arguments relative to
@@ -204,13 +211,14 @@ parent { child1; child2; }
 
 ### Value
 
-A value is either: a [String](#string), a [Raw String](#raw-string), a
-[Number](#number), a [Boolean](#boolean), or [Null](#null)
+A value is either: a [String](#string), a [Number](#number), a
+[Boolean](#boolean), or [Null](#null).
 
 Values _MUST_ be either [Arguments](#argument) or values of
 [Properties](#property).
 
-Values _MAY_ be prefixed by a single [Type Annotation](#type-annotation).
+Values (both as arguments and as properties) _MAY_ be prefixed by a single
+[Type Annotation](#type-annotation).
 
 ### Type Annotation
 
@@ -219,7 +227,7 @@ includes a _suggestion_ of what type the value is _intended_ to be treated as,
 or as a _context-specific elaboration_ of the more generic type the node name
 indicates.
 
-Type annotations are written as a set of `(` and `)` with a single
+Type annotations are written as a set of `(` and `)` with an
 [Identifier](#identifier) in it. Any valid identifier is considered a valid
 type annotation. There must be no whitespace between a type annotation and its
 associated Node Name or Value.
@@ -301,11 +309,18 @@ node prop=(regex)".*"
 
 ### String
 
-Strings in KDL represent textual [Values](#value). They are delimited by `"`
-on either side of any number of literal string characters except unescaped
-`"` and `\`. This includes literal [Newline](#newline) characters, which means a
-String Value can encompass multiple lines without behaving like a Newline for
-[Node](#node) parsing purposes.
+Strings in KDL represent textual [Values](#value), or unusual identifiers. A
+String is either a [Quoted String](#quoted-string) or a
+[Raw String](#raw-string). Quoted Strings may include escaped characters, while
+Raw Strings always contain only the literal characters that are present.
+
+### Quoted String
+
+A Quoted String is delimited by `"` on either side of any number of literal
+string characters except unescaped `"` and `\`. This includes literal
+[Newline](#newline) characters, which means a String Value can encompass
+multiple lines without behaving like a Newline for [Node](#node) parsing
+purposes.
 
 Strings _MUST_ be represented as UTF-8 values.
 
@@ -327,16 +342,18 @@ interpreted as described in the following table:
 
 ### Raw String
 
-Raw Strings in KDL are much like [Strings](#string), except they do not
-support `\`-escapes. They otherwise share the same properties as far as
+Raw Strings in KDL are much like [Quoted Strings](#quoted-string), except they
+do not support `\`-escapes. They otherwise share the same properties as far as
 literal [Newline](#newline) characters go, and the requirement of UTF-8
 representation.
 
 Raw String literals are represented as `r`, followed by zero or more `#`
-characters, followed by `"`, followed by any number of UTF-8 literals. The string is then
-closed by a `"` followed by a _matching_ number of `#` characters. This means
-that the string sequence `"` or `"#` and such must not match the closing `"`
-with the same or more `#` characters as the opening `r`.
+characters, followed by `"`, followed by any number of UTF-8 literals. The
+string is then closed by a `"` followed by a _matching_ number of `#`
+characters. This allows them to contain raw `"` or `#` characters; only the
+precise terminator (resembling `"##`, for example) ends the raw string. This
+means that the string sequence `"` or `"#` and such must not match the closing
+`"` with the same or more `#` characters as the opening `r`.
 
 #### Example
 
@@ -347,10 +364,9 @@ quotes-and-escapes r#"hello\n\r\asd"world"#
 
 ### Number
 
-Numbers in KDL represent numerical [Values](#value). There is no logical
-distinction in KDL between real numbers, integers, and floating point numbers.
-It's up to individual implementations to determine how to represent KDL
-numbers.
+Numbers in KDL represent numerical [Values](#value). There is no logical distinction in KDL
+between real numbers, integers, and floating point numbers. It's up to
+individual implementations to determine how to represent KDL numbers.
 
 There are four syntaxes for Numbers: Decimal, Hexadecimal, Octal, and Binary.
 

From 20d65edb7d43b75c2e638259787aba35b8dbc273 Mon Sep 17 00:00:00 2001
From: Benjamin Kane <92759008+bkane-msft@users.noreply.github.com>
Date: Fri, 23 Sep 2022 09:08:39 -0700
Subject: [PATCH 05/24] Add links to XML2KDL repo/online editor (#301)

Low effort KDLing should be easy :D
---
 XML-IN-KDL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/XML-IN-KDL.md b/XML-IN-KDL.md
index 8cb64fa..cac9d5f 100644
--- a/XML-IN-KDL.md
+++ b/XML-IN-KDL.md
@@ -7,7 +7,7 @@ This is version 1.0.0 of XiK.
 
 XML-in-KDL (XiK from now on) is a KDL microsyntax for losslessly encoding XML into a KDL document. XML and KDL, luckily, have *very similar* data models (KDL is *almost* a superset of XML), so it's quite straightforward to encode most XML documents into KDL.
 
-See [the website example](examples/website.kdl) for an example of this grammar in use to encode an HTML document.
+See [the website example](examples/website.kdl) for an example of this grammar in use to encode an HTML document. See [XML2KDL](https://github.com/Devasta/XML2KDL) (third party) to encode your XML in KDL (especially [their online editor](https://xsltfiddle.liberty-development.net/bET2rY5)).
 
 XML has several types of nodes, corresponding to certain KDL constructs:
 

From 6bab316b0f02e925690c4c118d3980d231c0f4d2 Mon Sep 17 00:00:00 2001
From: Hannah Kolbeck
Date: Thu, 29 Sep 2022 09:50:14 -0700
Subject: [PATCH 06/24] Remove credit from test suite README (#304)

---
 tests/README.md | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/tests/README.md b/tests/README.md
index 7c5fa5e..0ddfea0 100644
--- a/tests/README.md
+++ b/tests/README.md
@@ -52,10 +52,3 @@ please send a PR.
 If you think the disagreement is due to a genuine error or oversight
 in the KDL specification, please open an issue explaining the matter
 and the change will be considered for the next version of the KDL spec.
-
-## Credit
-
-This test suite was extracted from
-[`kdl4j`](https://github.com/hkolbeck/kdl4j), the original Java
-implementation of KDL, with huge thanks to
-[@hkolbeck](https://github.com/hkolbeck) for authoring them!

From 5253595c14c177727c59d962c11cfdb334fbf73b Mon Sep 17 00:00:00 2001
From: Aram Drevekenin
Date: Thu, 29 Sep 2022 18:52:25 +0200
Subject: [PATCH 07/24] docs(readme): link to vim syntax plugin (#305)

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 53ac901..660161d 100644
--- a/README.md
+++ b/README.md
@@ -51,6 +51,7 @@ entirety, but in the future, may be required to in order to be included here.
 ## Editor Support
 
 * [VS Code](https://marketplace.visualstudio.com/items?itemName=kdl-org.kdl&ssr=false#review-details)
+* [vim](https://github.com/imsnif/kdl.vim)
 
 ## Overview
 

From 8d252133b73300cd7f54070523a6dcb52a9abda0 Mon Sep 17 00:00:00 2001
From: Bannerets
Date: Mon, 3 Oct 2022 16:14:53 +0000
Subject: [PATCH 08/24] Mention an OCaml implementation in the README (#307)

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 660161d..8939bb5 100644
--- a/README.md
+++ b/README.md
@@ -39,6 +39,7 @@ free to jump in and give us your 2 cents!
 * .NET: [Kadlet](https://github.com/oledfish/Kadlet)
 * C: [ckdl](https://github.com/tjol/ckdl)
 * C++: [kdlpp](https://github.com/tjol/ckdl) (part of ckdl, requires C++20)
+* OCaml: [ocaml-kdl](https://github.com/Bannerets/ocaml-kdl)
 
 ## Compatibility Test Suite
 

From 0dc4a92a69be5ca401bea49fbd6051ad4b6f0d0b Mon Sep 17 00:00:00 2001
From: Lars Willighagen
Date: Mon, 10 Oct 2022 22:25:25 +0200
Subject: [PATCH 09/24] Replace use of non-defined 'tag' term (#310)

Fixes: https://github.com/kdl-org/kdl/issues/306

In the specifications of KQL and JiK, replace the usage of 'tag' with 'type annotation', as that is the term in the main KDL specification.
---
 JSON-IN-KDL.md | 14 +++++++-------
 QUERY-SPEC.md  | 10 +++++-----
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/JSON-IN-KDL.md b/JSON-IN-KDL.md
index c256612..4a0c717 100644
--- a/JSON-IN-KDL.md
+++ b/JSON-IN-KDL.md
@@ -41,14 +41,14 @@ Arguments and children can be mixed, if desired. The preceding example could als
 }
 ```
 
-Two otherwise-ambiguous cases must be manually annotated with an `(array)` tag:
+Two otherwise-ambiguous cases must be manually annotated with an `(array)` type annotation:
 
 * A single-element array (such as `[1]`) written using arguments (as `- 1`) would be ambiguous with a literal node.
   To indicate this is an array, it must be written as `(array)- 1`
   (Or rewritten to use child nodes, like `- { - 1 }`.)
-* An empty array (JSON `[]`) must use the `(array)` tag, like `(array)-`.
+* An empty array (JSON `[]`) must use the `(array)` type annotation, like `(array)-`.
 
-The `(array)` tag can be used on any other valid array node if desired, but has no effect in such cases.
+The `(array)` type annotation can be used on any other valid array node if desired, but has no effect in such cases.
 
 ----
 
@@ -97,14 +97,14 @@ The properties and/or children of the node represent the items of the object,
 with the property names and child nodenames as each item's key.
 All "keys" in an object node must be unique.
 
-As with arrays, there are two ambiguous cases that must be manually annoted with the `(object)` tag:
+As with arrays, there are two ambiguous cases that must be manually annoted with the `(object)` type annotation:
 
 * An object containing a single item whose key is "-" (like `{"-": 1}`) written using children (like `- { - 1 }`)
   would be ambiguous with an array node.
   To indicate this is an object, it must be written as `(object)- { - 1 }`.
   (Or, if the sole item's value is a literal, as in this example,
   it can be rewritten to use properties, as `- -=1`.)
-* An empty object (JSON `{}`) must use the `(object)` tag, like `(object)-`.
+* An empty object (JSON `{}`) must use the `(object)` type annotation, like `(object)-`.
 
 As with array nodes, `(object)` can be used on any valid object node if desired.
 
@@ -116,9 +116,9 @@ Only valid JiK nodes can be encoded to JSON; if a JiK document contains an inval
 
 * A literal node is valid if it contains a single unnamed argument.
 
-* An array node is valid if it contains only unnamed arguments and/or child nodes named "-". If it contains no arguments and no child nodes, its nodename *must* have the `(array)` tag.
+* An array node is valid if it contains only unnamed arguments and/or child nodes named "-". If it contains no arguments and no child nodes, its nodename *must* have the `(array)` type annotation.
 
-* An object node is valid if it contains only named properties and/or child nodes. Additionally, all "keys" must be unique within the node, whether they're encoded as property names or child node names. If it contains no properties and no child nodes, its nodename *must* have the `(object)` tag.
+* An object node is valid if it contains only named properties and/or child nodes. Additionally, all "keys" must be unique within the node, whether they're encoded as property names or child node names. If it contains no properties and no child nodes, its nodename *must* have the `(object)` type annotation.
 
 ----
 
diff --git a/QUERY-SPEC.md b/QUERY-SPEC.md
index 766794f..bec0cc4 100644
--- a/QUERY-SPEC.md
+++ b/QUERY-SPEC.md
@@ -32,8 +32,8 @@ binary operators.
 
 * `top()`: Returns all toplevel children of the current document.
 * `top() > []`: Equivalent to `top()` on its own.
-* `(foo)`: Selects any element with a tag named `foo`.
-* `()`: Selects any element with any tag.
+* `(foo)`: Selects any element whose type annotation is `foo`.
+* `()`: Selects any element with any type annotation.
 * `[val()]`: Selects any element with a value.
 * `[val(1)]`: Selects any element with a second value.
 * `[prop(foo)]`: Selects any element with a property named `foo`.
@@ -44,8 +44,8 @@ Attribute matchers support certain binary operators:
 * `[val() = 1]`: Selects any element whose first value is 1.
 * `[prop(name) = 1]`: Selects any element with a property `name` whose value is 1.
 * `[name = 1]`: Equivalent to the above.
-* `[name() = "hi"]`: Selects any element whose _node name_ is "hi". Equivalent to just `hi`, but more useful when using string operators.
-* `[tag() = "hi"]`: Selects any element whose tag is "hi". Equivalent to just `(hi)`, but more useful when using string operators.
+* `[name() = "hi"]`: Selects any element whose _node name_ is `"hi"`. Equivalent to just `hi`, but more useful when using string operators.
+* `[tag() = "hi"]`: Selects any element whose type annotation is `"hi"`. Equivalent to just `(hi)`, but more useful when using string operators.
 * `[val() != 1]`: Selects any element whose first value exists, and is not 1.
 
 The following operators work with any `val()` or `prop()` values.
@@ -67,7 +67,7 @@ If the value is not a string, the matcher will always fail:
 The following operators work only with `val()` or `prop()` values.
 If the value is not one of those, the matcher will always fail:
 
-* `[val() = (foo)]`: Selects any element whose tag is "foo".
+* `[val() = (foo)]`: Selects any element whose type annotation is `foo`.
## Map Operator From 6bf9b1c588df340e162b5673ae1ac1cb7c01b6b8 Mon Sep 17 00:00:00 2001 From: Patitotective Date: Thu, 13 Oct 2022 11:13:03 -0500 Subject: [PATCH 10/24] Update README.md (#311) --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 8939bb5..85c28d3 100644 --- a/README.md +++ b/README.md @@ -40,6 +40,7 @@ free to jump in and give us your 2 cents! * C: [ckdl](https://github.com/tjol/ckdl) * C++: [kdlpp](https://github.com/tjol/ckdl) (part of ckdl, requires C++20) * OCaml: [ocaml-kdl](https://github.com/Bannerets/ocaml-kdl) +* Nim: [kdl-nim](https://github.com/Patitotective/kdl-nim) ## Compatibility Test Suite From 6feeccc491808fe0b5425a18aa7ab4a8ac435e55 Mon Sep 17 00:00:00 2001 From: Exidex <16986685+Exidex@users.noreply.github.com> Date: Tue, 18 Oct 2022 00:33:00 +0200 Subject: [PATCH 11/24] Add Intellij IDEA plugin to README.md (#312) --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 85c28d3..d807879 100644 --- a/README.md +++ b/README.md @@ -54,6 +54,7 @@ entirety, but in the future, may be required to in order to be included here. * [VS Code](https://marketplace.visualstudio.com/items?itemName=kdl-org.kdl&ssr=false#review-details) * [vim](https://github.com/imsnif/kdl.vim) +* [Intellij IDEA](https://plugins.jetbrains.com/plugin/20136-kdl-document-language) ## Overview From a3d39e7749626159fd7b7fe6fd242b462c903376 Mon Sep 17 00:00:00 2001 From: chee Date: Wed, 15 Mar 2023 16:29:32 +0000 Subject: [PATCH 12/24] Add link to common lisp implementation (#319) --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index d807879..19dc48a 100644 --- a/README.md +++ b/README.md @@ -41,6 +41,7 @@ free to jump in and give us your 2 cents! 
* C++: [kdlpp](https://github.com/tjol/ckdl) (part of ckdl, requires C++20) * OCaml: [ocaml-kdl](https://github.com/Bannerets/ocaml-kdl) * Nim: [kdl-nim](https://github.com/Patitotective/kdl-nim) +* Common Lisp: [kdlcl](https://github.com/chee/kdlcl) ## Compatibility Test Suite From a75ca13c15a5f4345288e6a0962bf9a08098bfad Mon Sep 17 00:00:00 2001 From: Evgeny Date: Fri, 26 May 2023 02:09:47 +0700 Subject: [PATCH 13/24] Fix a typo in SPEC.md (#323) --- SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SPEC.md b/SPEC.md index fec972b..f2aa04c 100644 --- a/SPEC.md +++ b/SPEC.md @@ -106,7 +106,7 @@ codepoints other than [non-identifier characters](#non-identifier-characters), so long as this doesn't produce something confusable for a [Number](#number), [Boolean](#boolean), or [Null](#null). For example, both a [Number](#number) and an Identifier can start with `-`, but when an Identifier starts with `-` -the second character cannot be a digit. This is precicely specified in the +the second character cannot be a digit. This is precisely specified in the [Full Grammar](#full-grammar) below. Identifiers are terminated by [Whitespace](#whitespace) or From f3e5ff60270792ed7fb27ad3db787ff02a1de52c Mon Sep 17 00:00:00 2001 From: Tab Atkins Jr Date: Tue, 30 May 2023 14:13:46 -0700 Subject: [PATCH 14/24] Rearrange the number production to put decimal at the end While the grammar makes no statements about match order, parsers pretty universally test for decimal last, after the other number productions, because `0b010` (/etc) can look like a `0` followed by garbage. Matching this order can reduce confusion. Closes #330. 
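Illustration only, not part of the spec change: a minimal sketch of why match order matters for an ordered-choice parser. The regexes below are simplified stand-ins for the `hex`, `octal`, `binary` and `decimal` productions (signs, underscores, fractions and exponents are omitted), and `match_number` is a hypothetical helper, not taken from any KDL implementation.

```python
import re

# Simplified stand-ins for the KDL number productions (assumptions for
# illustration; the real grammar also allows signs, underscores, etc.).
HEX = r"0x[0-9a-fA-F]+"
OCTAL = r"0o[0-7]+"
BINARY = r"0b[01]+"
DECIMAL = r"[0-9]+"


def match_number(text, productions):
    """Try each production in order and return the first prefix it matches."""
    for pattern in productions:
        m = re.match(pattern, text)
        if m:
            return m.group(0)
    return None


decimal_first = [DECIMAL, HEX, OCTAL, BINARY]
decimal_last = [HEX, OCTAL, BINARY, DECIMAL]

# Decimal first: only the leading "0" is consumed, leaving "b010" as garbage.
print(match_number("0b010", decimal_first))
# Decimal last: the binary production gets its chance and takes the whole token.
print(match_number("0b010", decimal_last))
```

A parser that tries alternatives in written order behaves like `decimal_last` once the production is listed as `hex | octal | binary | decimal`.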
--- SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SPEC.md b/SPEC.md index f2aa04c..aa75d85 100644 --- a/SPEC.md +++ b/SPEC.md @@ -484,7 +484,7 @@ raw-string := 'r' raw-string-hash raw-string-hash := '#' raw-string-hash '#' | raw-string-quotes raw-string-quotes := '"' .* '"' -number := decimal | hex | octal | binary +number := hex | octal | binary | decimal decimal := sign? integer ('.' integer)? exponent? exponent := ('e' | 'E') sign? integer From 09801faa9368e1407e2f081d1b37d3874f34ddb6 Mon Sep 17 00:00:00 2001 From: Danielle Smith Date: Sat, 1 Jul 2023 00:06:31 +0200 Subject: [PATCH 15/24] add playground link to README.md (#332) --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 19dc48a..c73161b 100644 --- a/README.md +++ b/README.md @@ -18,6 +18,8 @@ modifications and clarifications on its syntax and behavior. The current version of the KDL spec is `1.0.0`. +[Play with it in your browser!](https://kdl-play.danini.dev/) + ## Design and Discussion KDL is still extremely new, and discussion about the format should happen over From 11d8e912fcec6d5e485d31ee66a9fe436d8b73de Mon Sep 17 00:00:00 2001 From: Evgeny Date: Thu, 27 Jul 2023 04:34:57 +0700 Subject: [PATCH 16/24] add Sublime Text support (#326) --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index c73161b..daf86c1 100644 --- a/README.md +++ b/README.md @@ -56,6 +56,7 @@ entirety, but in the future, may be required to in order to be included here. 
## Editor Support * [VS Code](https://marketplace.visualstudio.com/items?itemName=kdl-org.kdl&ssr=false#review-details) +* [Sublime Text](https://packagecontrol.io/packages/KDL) * [vim](https://github.com/imsnif/kdl.vim) * [Intellij IDEA](https://plugins.jetbrains.com/plugin/20136-kdl-document-language) From 6fa99c2586f4c8f4134f6930f9ebecbfc35df4f6 Mon Sep 17 00:00:00 2001 From: Jonathan Date: Tue, 26 Sep 2023 21:18:51 +0200 Subject: [PATCH 17/24] Add link to Go implementation (#334) --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index daf86c1..5da2c95 100644 --- a/README.md +++ b/README.md @@ -44,6 +44,7 @@ free to jump in and give us your 2 cents! * OCaml: [ocaml-kdl](https://github.com/Bannerets/ocaml-kdl) * Nim: [kdl-nim](https://github.com/Patitotective/kdl-nim) * Common Lisp: [kdlcl](https://github.com/chee/kdlcl) +* Go: [gokdl](https://github.com/lunjon/gokdl) ## Compatibility Test Suite From 652590fad3d842883e971d030a152720ef592a92 Mon Sep 17 00:00:00 2001 From: Tab Atkins-Bittner Date: Fri, 6 Oct 2023 13:56:03 -0700 Subject: [PATCH 18/24] Allow single-line comments with nothing after them. Fixes #318 --- SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SPEC.md b/SPEC.md index aa75d85..4b28932 100644 --- a/SPEC.md +++ b/SPEC.md @@ -510,7 +510,7 @@ bom := '\u{FEFF}' unicode-space := See Table (All White_Space unicode characters which are not `newline`) -single-line-comment := '//' ^newline+ (newline | eof) +single-line-comment := '//' ^newline* (newline | eof) multi-line-comment := '/*' commented-block commented-block := '*/' | (multi-line-comment | '*' | '/' | [^*/]+) commented-block ``` From ef1bb689b0853ed82c399e7e7351f9e89e22de4f Mon Sep 17 00:00:00 2001 From: Tab Atkins-Bittner Date: Fri, 6 Oct 2023 13:58:35 -0700 Subject: [PATCH 19/24] Add vertical tab to whitespace characters. 
Fixes #331 --- SPEC.md | 1 + 1 file changed, 1 insertion(+) diff --git a/SPEC.md b/SPEC.md index 4b28932..d1e56fc 100644 --- a/SPEC.md +++ b/SPEC.md @@ -413,6 +413,7 @@ space](https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt): | Name | Code Pt | |----------------------|---------| | Character Tabulation | `U+0009` | +| Line Tabulation | `U+000B` | | Space | `U+0020` | | No-Break Space | `U+00A0` | | Ogham Space Mark | `U+1680` | From 54f5fc80256f736198690ffbccd0f729585eb418 Mon Sep 17 00:00:00 2001 From: Tab Atkins-Bittner Date: Fri, 6 Oct 2023 14:07:28 -0700 Subject: [PATCH 20/24] Revert "Add vertical tab to whitespace characters. Fixes #331" This reverts commit ef1bb689b0853ed82c399e7e7351f9e89e22de4f. --- SPEC.md | 1 - 1 file changed, 1 deletion(-) diff --git a/SPEC.md b/SPEC.md index d1e56fc..4b28932 100644 --- a/SPEC.md +++ b/SPEC.md @@ -413,7 +413,6 @@ space](https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt): | Name | Code Pt | |----------------------|---------| | Character Tabulation | `U+0009` | -| Line Tabulation | `U+000B` | | Space | `U+0020` | | No-Break Space | `U+00A0` | | Ogham Space Mark | `U+1680` | From 270c60ca9a3d05454b630dc3d8d6fbd6812efd10 Mon Sep 17 00:00:00 2001 From: Tab Atkins-Bittner Date: Fri, 6 Oct 2023 14:07:29 -0700 Subject: [PATCH 21/24] Revert "Allow single-line comments with nothing after them. Fixes #318" This reverts commit 652590fad3d842883e971d030a152720ef592a92. 
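For reference, a rough sketch of the practical difference between the two forms of `single-line-comment` that this revert switches between. It assumes `newline` is just `\n` (the spec's `newline` production covers more characters) and models `eof` with `\Z`; `PLUS_FORM`, `STAR_FORM` and `is_comment` are hypothetical names used only here.

```python
import re

# '//' ^newline+ (newline | eof): at least one character after the slashes.
PLUS_FORM = re.compile(r"//[^\n]+(\n|\Z)")
# '//' ^newline* (newline | eof): the comment body may be empty.
STAR_FORM = re.compile(r"//[^\n]*(\n|\Z)")


def is_comment(pattern, text):
    """Return True if the whole text is one single-line comment."""
    return pattern.fullmatch(text) is not None


for sample in ["// hi\n", "//\n", "//"]:
    print(repr(sample), is_comment(PLUS_FORM, sample), is_comment(STAR_FORM, sample))
```

Under these assumptions, the `+` form restored here does not accept a bare `//` followed directly by a newline or end of file, while the `*` form does.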
--- SPEC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SPEC.md b/SPEC.md index 4b28932..aa75d85 100644 --- a/SPEC.md +++ b/SPEC.md @@ -510,7 +510,7 @@ bom := '\u{FEFF}' unicode-space := See Table (All White_Space unicode characters which are not `newline`) -single-line-comment := '//' ^newline* (newline | eof) +single-line-comment := '//' ^newline+ (newline | eof) multi-line-comment := '/*' commented-block commented-block := '*/' | (multi-line-comment | '*' | '/' | [^*/]+) commented-block ``` From 9f10522717225ecadfea167b47d444d2bc8921a9 Mon Sep 17 00:00:00 2001 From: sblinch Date: Tue, 17 Oct 2023 09:23:57 -0700 Subject: [PATCH 22/24] Add kdl-go to README.md (#336) --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 5da2c95..78cfcb7 100644 --- a/README.md +++ b/README.md @@ -44,7 +44,7 @@ free to jump in and give us your 2 cents! * OCaml: [ocaml-kdl](https://github.com/Bannerets/ocaml-kdl) * Nim: [kdl-nim](https://github.com/Patitotective/kdl-nim) * Common Lisp: [kdlcl](https://github.com/chee/kdlcl) -* Go: [gokdl](https://github.com/lunjon/gokdl) +* Go: [gokdl](https://github.com/lunjon/gokdl), [kdl-go](https://github.com/sblinch/kdl-go) ## Compatibility Test Suite From 7b7d57bf29d1b2f103373cbab4d416899554f615 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kat=20March=C3=A1n?= Date: Tue, 17 Oct 2023 09:55:32 -0700 Subject: [PATCH 23/24] lead with a more complete example in the readme --- README.md | 39 ++++++++++++++++++++++++++++++++++++--- 1 file changed, 36 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 78cfcb7..090fdf8 100644 --- a/README.md +++ b/README.md @@ -1,9 +1,42 @@ # The KDL Document Language -KDL is a document language with xml-like semantics that looks like you're -invoking a bunch of CLI commands! It's meant to be used both as a +KDL is a small, pleasing document language with xml-like semantics that looks +like you're invoking a bunch of CLI commands! 
It's meant to be used both as a serialization format and a configuration language, much like JSON, YAML, or -XML. +XML. It looks like this: + +```kdl +package { + name "my-pkg" + version "1.2.3" + + dependencies { + // Nodes can have standalone values as well as key/value pairs. + lodash "^3.2.1" optional=true alias="underscore" + } + + scripts { + // "Raw" and multi-line strings are supported. + build r#" + echo "foo" + node -c "console.log('hello, world!');" + echo "foo" > some-file.txt + "# + } + + // `\` breaks up a single node across multiple lines. + the-matrix 1 2 3 \ + 4 5 6 \ + 7 8 9 + + // "Slashdash" comments operate at the node level, with just `/-`. + /-this-is-commented { + this "entire" "node" { + "is" "gone" + } + } +} +``` There's a living [specification](SPEC.md), as well as various [implementations](#implementations). You can also check out the [FAQ](#faq) to From ef93a6b10c4e16d94194280bb6687661d7024476 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kat=20March=C3=A1n?= Date: Tue, 17 Oct 2023 10:00:44 -0700 Subject: [PATCH 24/24] wrap comments --- README.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 090fdf8..624313e 100644 --- a/README.md +++ b/README.md @@ -11,7 +11,8 @@ package { version "1.2.3" dependencies { - // Nodes can have standalone values as well as key/value pairs. + // Nodes can have standalone values as well as + // key/value pairs. lodash "^3.2.1" optional=true alias="underscore" } @@ -29,7 +30,8 @@ package { 4 5 6 \ 7 8 9 - // "Slashdash" comments operate at the node level, with just `/-`. + // "Slashdash" comments operate at the node level, + // with just `/-`. /-this-is-commented { this "entire" "node" { "is" "gone"