Merge branch 'kdl-v2' into ds-fix-weird-escape-char

Kat Marchán 2022-08-28 12:58:40 -07:00 committed by GitHub
commit 8ddfef240c
GPG Key ID: 4AEE18F83AFDEB23
29 changed files with 75 additions and 49 deletions


@ -26,14 +26,17 @@ free to jump in and give us your 2 cents!
## Implementations ## Implementations
* Rust: [kdl-rs](https://github.com/kdl-org/kdl-rs) * Rust: [kdl-rs](https://github.com/kdl-org/kdl-rs), [knuffel](https://crates.io/crates/knuffel/) (latter includes derive macro), and [kaydle](https://github.com/Lucretiel/kaydle) (serde-based)
* JavaScript: [kdljs](https://github.com/kdl-org/kdljs) * JavaScript: [kdljs](https://github.com/kdl-org/kdljs), [@virtualstate/kdl](https://github.com/virtualstate/kdl) (query only, JSX based)
* Ruby: [kdl-rb](https://github.com/danini-the-panini/kdl-rb) * Ruby: [kdl-rb](https://github.com/danini-the-panini/kdl-rb)
* Dart: [kdl-dart](https://github.com/danini-the-panini/kdl-dart) * Dart: [kdl-dart](https://github.com/danini-the-panini/kdl-dart)
* Java: [kdl4j](https://github.com/hkolbeck/kdl4j) * Java: [kdl4j](https://github.com/hkolbeck/kdl4j)
* PHP: [kdl-php](https://github.com/kdl-org/kdl-php) * PHP: [kdl-php](https://github.com/kdl-org/kdl-php)
* Python: [kdl-py](https://github.com/daeken/kdl-py) * Python: [kdl-py](https://github.com/tabatkins/kdlpy), [cuddle](https://github.com/djmattyg007/python-cuddle)
* Elixir: [kuddle](https://github.com/IceDragon200/kuddle) * Elixir: [kuddle](https://github.com/IceDragon200/kuddle)
* XSLT: [xml2kdl](https://github.com/Devasta/XML2KDL)
* Haskell: [Hustle](https://github.com/fuzzypixelz/Hustle)
* .NET: [Kadlet](https://github.com/oledfish/Kadlet)
## Compatibility Test Suite ## Compatibility Test Suite

55
SPEC.md

@ -18,6 +18,12 @@ rules, with some semantic exceptions involving the data model.
KDL is designed to be easy to read _and_ easy to implement. KDL is designed to be easy to read _and_ easy to implement.
In this document, references to "left" or "right" refer to directions in the
*data stream* towards the beginning or end, respectively; in other words,
the directions if the data stream were only ASCII text. They do not refer
to the writing direction of text, which can flow in either direction,
depending on the characters used.
## Components ## Components
### Document ### Document
@ -57,8 +63,12 @@ slash-escaped line continuation](#line-continuation). Arguments and Properties
may be interspersed in any order, much like is common with positional may be interspersed in any order, much like is common with positional
arguments vs options in command line tools. arguments vs options in command line tools.
Arguments are ordered relative to each other and that order must be preserved [Children](#children-block) can be placed after the name and the optional
in order to maintain the semantics. Arguments and Properties, possibly separated by either whitespace or a
slash-escaped line continuation.
Arguments are ordered relative to each other (but not relative to Properties)
and that order must be preserved in order to maintain the semantics.
By contrast, Property order _SHOULD NOT_ matter to implementations. By contrast, Property order _SHOULD NOT_ matter to implementations.
[Children](#children-block) should be used if an order-sensitive key/value [Children](#children-block) should be used if an order-sensitive key/value
@ -68,9 +78,8 @@ Nodes _MAY_ be prefixed with `/-` to "comment out" the entire node, including
its properties, arguments, and children, and make it act as plain whitespace, its properties, arguments, and children, and make it act as plain whitespace,
even if it spreads across multiple lines. even if it spreads across multiple lines.
Finally, a node is terminated by either a [Newline](#newline), a [Children Finally, a node is terminated by either a [Newline](#newline), a semicolon (`;`)
Block](#children-block), a semicolon (`;`) or the end of the file/stream (an or the end of the file/stream (an `EOF`).
`EOF`).
#### Example #### Example
@ -87,7 +96,10 @@ A bare Identifier is composed of any Unicode codepoint other than [non-initial
characters](#non-initial-characters), followed by any number of Unicode characters](#non-initial-characters), followed by any number of Unicode
codepoints other than [non-identifier characters](#non-identifier-characters), codepoints other than [non-identifier characters](#non-identifier-characters),
so long as this doesn't produce something confusable for a [Number](#number), so long as this doesn't produce something confusable for a [Number](#number),
[Boolean](#boolean), or [Null](#null). [Boolean](#boolean), or [Null](#null). For example, both a [Number](#number)
and an Identifier can start with `-`, but when an Identifier starts with `-`
the second character cannot be a digit. This is precisely specified in the
[Full Grammar](#full-grammar) below.
Identifiers are terminated by [Whitespace](#whitespace) or Identifiers are terminated by [Whitespace](#whitespace) or
[Newlines](#newline). [Newlines](#newline).
@ -100,6 +112,11 @@ The following characters cannot be the first character in a bare
* Any decimal digit (0-9) * Any decimal digit (0-9)
* Any [non-identifier characters](#non-identifier-characters) * Any [non-identifier characters](#non-identifier-characters)
Be aware that the `-` character can only be used as an initial
character if the second character is not a digit. This allows
identifiers to look like `--this`, and removes the ambiguity
of having an identifier look like a negative number.
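The initial-character rule added in this hunk can be sketched in Python. This is only an illustration: `violates_initial_rules` is a hypothetical helper, and it checks just the two rules quoted here (no leading digit, no `-` followed by a digit). The non-identifier character list is not reproduced, so this is not a complete identifier validator.

```python
ASCII_DIGITS = "0123456789"


def violates_initial_rules(ident: str) -> bool:
    """Return True if `ident` breaks the initial-character rules quoted above.

    Hypothetical helper: only the leading-digit rule and the
    `-`-followed-by-digit rule are checked.
    """
    if not ident:
        return True  # an empty string cannot be an identifier
    if ident[0] in ASCII_DIGITS:
        return True  # identifiers cannot start with a decimal digit
    if ident[0] == "-" and len(ident) > 1 and ident[1] in ASCII_DIGITS:
        return True  # would be confusable with a negative number
    return False
```

Under these assumptions `--this` passes, while `-1abc` and `9lives` are rejected.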
### Non-identifier characters ### Non-identifier characters
The following characters cannot be used anywhere in a bare The following characters cannot be used anywhere in a bare
@ -168,7 +185,7 @@ my-node 1 2 3 "a" "b" "c"
### Children Block ### Children Block
A children block is a block of [Nodes](#node), surrounded by `{` and `}`. They A children block is a block of [Nodes](#node), surrounded by `{` and `}`. They
are an optional terminator for nodes, and create a hierarchy of KDL nodes. are an optional part of nodes, and create a hierarchy of KDL nodes.
Regular node termination rules apply, which means multiple nodes can be Regular node termination rules apply, which means multiple nodes can be
included in a single-line children block, as long as they're all terminated by included in a single-line children block, as long as they're all terminated by
@ -258,10 +275,10 @@ IEEE 754-2008 decimal floating point numbers
* `country-2`: ISO 3166-1 alpha-2 country code. * `country-2`: ISO 3166-1 alpha-2 country code.
* `country-3`: ISO 3166-1 alpha-3 country code. * `country-3`: ISO 3166-1 alpha-3 country code.
* `country-subdivision`: ISO 3166-2 country subdivision code. * `country-subdivision`: ISO 3166-2 country subdivision code.
* `email`: RFC5302 email address. * `email`: RFC5322 email address.
* `idn-email`: RFC6531 internationalized email address. * `idn-email`: RFC6531 internationalized email address.
* `hostname`: RFC1132 internet hostname. * `hostname`: RFC1132 internet hostname (only ASCII segments)
* `idn-hostname`: RFC5890 internationalized internet hostname. * `idn-hostname`: RFC5890 internationalized internet hostname (only `xn--`-prefixed ASCII "punycode" segments, or non-ASCII segments)
* `ipv4`: RFC2673 dotted-quad IPv4 address. * `ipv4`: RFC2673 dotted-quad IPv4 address.
* `ipv6`: RFC2373 IPv6 address. * `ipv6`: RFC2373 IPv6 address.
* `url`: RFC3986 URI. * `url`: RFC3986 URI.
@ -397,6 +414,13 @@ space](https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt):
| Medium Mathematical Space | `U+205F` | | Medium Mathematical Space | `U+205F` |
| Ideographic Space | `U+3000` | | Ideographic Space | `U+3000` |
#### Multi-line comments
In addition to single-line comments using `//`, comments can also be started
with `/*` and ended with `*/`. These comments can span multiple lines. They
are allowed in all positions where [Whitespace](#whitespace) is allowed and
can be nested.
### Newline ### Newline
The following characters [should be treated as new The following characters [should be treated as new
@ -419,8 +443,8 @@ Note that for the purpose of new lines, CRLF is considered _a single newline_.
``` ```
nodes := linespace* (node nodes?)? linespace* nodes := linespace* (node nodes?)? linespace*
node := ('/-' node-space*)? type? identifier (node-space node-space* node-props-and-args)* (node-space* node-children ws*)? node-space* node-terminator node := ('/-' node-space*)? type? identifier (node-space+ node-prop-or-arg)* (node-space* node-children ws*)? node-space* node-terminator
node-props-and-args := ('/-' node-space*)? (prop | value) node-prop-or-arg := ('/-' node-space*)? (prop | value)
node-children := ('/-' node-space*)? '{' nodes '}' node-children := ('/-' node-space*)? '{' nodes '}'
node-space := ws* escline ws* | ws+ node-space := ws* escline ws* | ws+
node-terminator := single-line-comment | newline | ';' | eof node-terminator := single-line-comment | newline | ';' | eof
@ -445,9 +469,10 @@ raw-string-quotes := '"' .* '"'
number := decimal | hex | octal | binary number := decimal | hex | octal | binary
decimal := integer ('.' [0-9] [0-9_]*)? exponent? decimal := sign? integer ('.' integer)? exponent?
exponent := ('e' | 'E') integer exponent := ('e' | 'E') sign? integer
integer := sign? [0-9] [0-9_]* integer := digit (digit | '_')*
digit := [0-9]
sign := '+' | '-' sign := '+' | '-'
hex := sign? '0x' hex-digit (hex-digit | '_')* hex := sign? '0x' hex-digit (hex-digit | '_')*


@ -78,6 +78,9 @@ document {
node "contributor" description="Contributor to the schema" { node "contributor" description="Contributor to the schema" {
value ref=r#"[id="info-person-name"]"# value ref=r#"[id="info-person-name"]"#
prop ref=r#"[id="info-orcid"]"# prop ref=r#"[id="info-orcid"]"#
children {
node ref=r#"[id="info-link"]"#
}
} }
node "link" id="info-link" description="Links to itself, and to sources describing it" { node "link" id="info-link" description="Links to itself, and to sources describing it" {
value description="A URL that the link points to" { value description="A URL that the link points to" {
@ -228,7 +231,7 @@ document {
children id="validations" description="General value validations." { children id="validations" description="General value validations." {
node "tag" id="value-tag-node" description="The tags associated with this value" { node "tag" id="value-tag-node" description="The tags associated with this value" {
max 1 max 1
children ref="[id="validations"]" children ref=r#"[id="validations"]"#
} }
node "type" description="The type for this prop's value." { node "type" description="The type for this prop's value." {
max 1 max 1
@ -269,7 +272,7 @@ document {
min 1 min 1
type "string" type "string"
// https://json-schema.org/understanding-json-schema/reference/string.html#format // https://json-schema.org/understanding-json-schema/reference/string.html#format
enum "date-time" "date" "time" "duration" "decimal" "currency" "country-2" "country-3" "country-subdivision" "email" "idn-email" "hostname" "idn-hostname" "ipv4" "ipv6" "url" "url-reference" "irl", "irl-reference" "url-template" "regex" "uuid" "kdl-query" "i8" "i16" "i32" "i64" "u8" "u16" "u32" "u64" "isize" "usize" "f32" "f64" "decimal64" "decimal128" enum "date-time" "date" "time" "duration" "decimal" "currency" "country-2" "country-3" "country-subdivision" "email" "idn-email" "hostname" "idn-hostname" "ipv4" "ipv6" "url" "url-reference" "irl" "irl-reference" "url-template" "regex" "uuid" "kdl-query" "i8" "i16" "i32" "i64" "u8" "u16" "u32" "u64" "isize" "usize" "f32" "f64" "decimal64" "decimal128"
} }
} }
node "%" description="Only used for numeric values. Constrains them to be multiples of the given number(s)" { node "%" description="Only used for numeric values. Constrains them to be multiples of the given number(s)" {


@ -1 +1 @@
node "\"\\\b\f\n\r\t" node "\"\\/\b\f\n\r\t"


@ -1 +1 @@
node (type)0x10 node (type)16


@ -1 +1 @@
node 0b10 node 2


@ -1 +1 @@
node 0b10 node 2


@ -1 +1 @@
node 0b10 node 2


@ -1,2 +1 @@
node { node
}


@ -1,2 +1 @@
node { node
}


@ -1,2 +1 @@
node { node
}


@ -1,2 +1 @@
node { node
}


@ -1,2 +0,0 @@
node1
node2


@ -1 +1 @@
node 0xabcdef1234567890 node 12379813812177893520


@ -1 +1 @@
node 0xabcdef0123456789abcdef node 207698809136909011942886895


@ -1 +1 @@
node 0xabcdef0123 node 737894400291


@ -1 +1 @@
node 0x1 node 1


@ -1 +1 @@
node 0b1 node 1


@ -1 +1 @@
node 0o1 node 1


@ -1 +1 @@
node 0o76543210 node 16434824


@ -1 +1 @@
node 1 1.0 1.0E+10 1.0E-10 0x1 0o7 0b10 "arg" "arg\\\\" true false null node 1 1.0 1.0E+10 1.0E-10 1 7 2 "arg" "arg\\\\" true false null
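The expected-output side of these test hunks replaces hex, octal and binary literals with their decimal value (for example `0x10` becomes `16` and `0b10` becomes `2`). A minimal Python sketch of that normalization follows. `to_decimal_text` is a hypothetical helper, and it leans on Python's own prefix handling rather than the KDL grammar, so it should be read as an approximation.

```python
def to_decimal_text(literal: str) -> str:
    """Render a 0x/0o/0b-prefixed integer literal as decimal text.

    Hypothetical helper: int(..., 0) infers the base from the prefix and
    accepts '_' separators, which is close to, but not identical to, the
    KDL number grammar.
    """
    return str(int(literal, 0))


for source in ("0x10", "0b10", "0o7", "0xabcdef1234567890"):
    print(source, "->", to_decimal_text(source))
```

Python integers are arbitrary precision, so the large hex cases in the suite do not overflow in this sketch. An implementation built on fixed-width integers would need separate handling for them.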


@ -1 +0,0 @@
node key=(type)0x10


@ -1,2 +1 @@
node1 { node1
}


@ -0,0 +1 @@
node arg="correct"


@ -1 +1 @@
node 0x123abc node 1194684


@ -1 +1 @@
node 0o123 node 83


@ -0,0 +1 @@
node 1.02
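This new test case exercises the revised `decimal` rule from the SPEC.md grammar hunk (`sign? integer ('.' integer)? exponent?`, with `integer := digit (digit | '_')*`). As a rough illustration, that rule can be transcribed into a Python regular expression. The names `DECIMAL` and `is_decimal` are assumptions made for this sketch, and it covers the `decimal` production only, not hex, octal or binary.

```python
import re

# integer := digit (digit | '_')*
INTEGER = r"[0-9][0-9_]*"

# decimal := sign? integer ('.' integer)? exponent?
# exponent := ('e' | 'E') sign? integer
DECIMAL = re.compile(rf"[+-]?{INTEGER}(?:\.{INTEGER})?(?:[eE][+-]?{INTEGER})?")


def is_decimal(text: str) -> bool:
    """Return True if `text` matches the revised decimal rule in full."""
    return DECIMAL.fullmatch(text) is not None
```

With this transcription `1.02` and `1.0E+10` match, while `1._5` does not, because the fractional part must begin with a digit.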


@ -1 +1 @@
node 0o1234567 node 342391


@ -0,0 +1 @@
node arg="correct" /- arg="wrong"