diff --git a/README.md b/README.md index 739142c..d5ca1a1 100644 --- a/README.md +++ b/README.md @@ -26,14 +26,17 @@ free to jump in and give us your 2 cents! ## Implementations -* Rust: [kdl-rs](https://github.com/kdl-org/kdl-rs) -* JavaScript: [kdljs](https://github.com/kdl-org/kdljs) +* Rust: [kdl-rs](https://github.com/kdl-org/kdl-rs), [knuffel](https://crates.io/crates/knuffel/) (latter includes derive macro), and [kaydle](https://github.com/Lucretiel/kaydle) (serde-based) +* JavaScript: [kdljs](https://github.com/kdl-org/kdljs), [@virtualstate/kdl](https://github.com/virtualstate/kdl) (query only, JSX based) * Ruby: [kdl-rb](https://github.com/danini-the-panini/kdl-rb) * Dart: [kdl-dart](https://github.com/danini-the-panini/kdl-dart) * Java: [kdl4j](https://github.com/hkolbeck/kdl4j) * PHP: [kdl-php](https://github.com/kdl-org/kdl-php) -* Python: [kdl-py](https://github.com/daeken/kdl-py) +* Python: [kdl-py](https://github.com/tabatkins/kdlpy), [cuddle](https://github.com/djmattyg007/python-cuddle) * Elixir: [kuddle](https://github.com/IceDragon200/kuddle) +* XSLT: [xml2kdl](https://github.com/Devasta/XML2KDL) +* Haskell: [Hustle](https://github.com/fuzzypixelz/Hustle) +* .NET: [Kadlet](https://github.com/oledfish/Kadlet) ## Compatibility Test Suite diff --git a/SPEC.md b/SPEC.md index bfde671..f625ba1 100644 --- a/SPEC.md +++ b/SPEC.md @@ -18,6 +18,12 @@ rules, with some semantic exceptions involving the data model. KDL is designed to be easy to read _and_ easy to implement. +In this document, references to "left" or "right" refer to directions in the +*data stream* towards the beginning or end, respectively; in other words, +the directions if the data stream were only ASCII text. They do not refer +to the writing direction of text, which can flow in either direction, +depending on the characters used. + ## Components ### Document @@ -57,8 +63,12 @@ slash-escaped line continuation](#line-continuation). 
Arguments and Properties may be interspersed in any order, much like is common with positional arguments vs options in command line tools. -Arguments are ordered relative to each other and that order must be preserved -in order to maintain the semantics. +[Children](#children-block) can be placed after the name and the optional +Arguments and Properties, possibly separated by either whitespace or a +slash-escaped line continuation. + +Arguments are ordered relative to each other (but not relative to Properties) +and that order must be preserved in order to maintain the semantics. By contrast, Property order _SHOULD NOT_ matter to implementations. [Children](#children-block) should be used if an order-sensitive key/value @@ -68,9 +78,8 @@ Nodes _MAY_ be prefixed with `/-` to "comment out" the entire node, including its properties, arguments, and children, and make it act as plain whitespace, even if it spreads across multiple lines. -Finally, a node is terminated by either a [Newline](#newline), a [Children -Block](#children-block), a semicolon (`;`) or the end of the file/stream (an -`EOF`). +Finally, a node is terminated by either a [Newline](#newline), a semicolon (`;`) +or the end of the file/stream (an `EOF`). #### Example @@ -87,7 +96,10 @@ A bare Identifier is composed of any Unicode codepoint other than [non-initial characters](#non-initial-characters), followed by any number of Unicode codepoints other than [non-identifier characters](#non-identifier-characters), so long as this doesn't produce something confusable for a [Number](#number), -[Boolean](#boolean), or [Null](#null). +[Boolean](#boolean), or [Null](#null). For example, both a [Number](#number) +and an Identifier can start with `-`, but when an Identifier starts with `-` +the second character cannot be a digit. This is precisely specified in the +[Full Grammar](#full-grammar) below. Identifiers are terminated by [Whitespace](#whitespace) or [Newlines](#newline). 
@@ -100,6 +112,11 @@ The following characters cannot be the first character in a bare * Any decimal digit (0-9) * Any [non-identifier characters](#non-identifier-characters) +Be aware that the `-` character can only be used as an initial +character if the second character is not a digit. This allows +identifiers to look like `--this`, and removes the ambiguity +of having an identifier look like a negative number. + ### Non-identifier characters The following characters cannot be used anywhere in a bare @@ -168,7 +185,7 @@ my-node 1 2 3 "a" "b" "c" ### Children Block A children block is a block of [Nodes](#node), surrounded by `{` and `}`. They -are an optional terminator for nodes, and create a hierarchy of KDL nodes. +are an optional part of nodes, and create a hierarchy of KDL nodes. Regular node termination rules apply, which means multiple nodes can be included in a single-line children block, as long as they're all terminated by @@ -258,10 +275,10 @@ IEEE 754-2008 decimal floating point numbers * `country-2`: ISO 3166-1 alpha-2 country code. * `country-3`: ISO 3166-1 alpha-3 country code. * `country-subdivision`: ISO 3166-2 country subdivision code. -* `email`: RFC5302 email address. +* `email`: RFC5322 email address. * `idn-email`: RFC6531 internationalized email address. -* `hostname`: RFC1132 internet hostname. -* `idn-hostname`: RFC5890 internationalized internet hostname. +* `hostname`: RFC1132 internet hostname (only ASCII segments) +* `idn-hostname`: RFC5890 internationalized internet hostname (only `xn--`-prefixed ASCII "punycode" segments, or non-ASCII segments) * `ipv4`: RFC2673 dotted-quad IPv4 address. * `ipv6`: RFC2373 IPv6 address. * `url`: RFC3986 URI. 
@@ -397,6 +414,13 @@ space](https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt): | Medium Mathematical Space | `U+205F` | | Ideographic Space | `U+3000` | +#### Multi-line comments + +In addition to single-line comments using `//`, comments can also be started +with `/*` and ended with `*/`. These comments can span multiple lines. They +are allowed in all positions where [Whitespace](#whitespace) is allowed and +can be nested. + ### Newline The following characters [should be treated as new @@ -419,8 +443,8 @@ Note that for the purpose of new lines, CRLF is considered _a single newline_. ``` nodes := linespace* (node nodes?)? linespace* -node := ('/-' node-space*)? type? identifier (node-space node-space* node-props-and-args)* (node-space* node-children ws*)? node-space* node-terminator -node-props-and-args := ('/-' node-space*)? (prop | value) +node := ('/-' node-space*)? type? identifier (node-space+ node-prop-or-arg)* (node-space* node-children ws*)? node-space* node-terminator +node-prop-or-arg := ('/-' node-space*)? (prop | value) node-children := ('/-' node-space*)? '{' nodes '}' node-space := ws* escline ws* | ws+ node-terminator := single-line-comment | newline | ';' | eof @@ -445,9 +469,10 @@ raw-string-quotes := '"' .* '"' number := decimal | hex | octal | binary -decimal := integer ('.' [0-9] [0-9_]*)? exponent? -exponent := ('e' | 'E') integer -integer := sign? [0-9] [0-9_]* +decimal := sign? integer ('.' integer)? exponent? +exponent := ('e' | 'E') sign? integer +integer := digit (digit | '_')* +digit := [0-9] sign := '+' | '-' hex := sign? 
'0x' hex-digit (hex-digit | '_')* diff --git a/examples/kdl-schema.kdl b/examples/kdl-schema.kdl index 667c9fd..76a1080 100644 --- a/examples/kdl-schema.kdl +++ b/examples/kdl-schema.kdl @@ -78,6 +78,9 @@ document { node "contributor" description="Contributor to the schema" { value ref=r#"[id="info-person-name"]"# prop ref=r#"[id="info-orcid"]"# + children { + node ref=r#"[id="info-link"]"# + } } node "link" id="info-link" description="Links to itself, and to sources describing it" { value description="A URL that the link points to" { @@ -228,7 +231,7 @@ document { children id="validations" description="General value validations." { node "tag" id="value-tag-node" description="The tags associated with this value" { max 1 - children ref="[id="validations"]" + children ref=r#"[id="validations"]"# } node "type" description="The type for this prop's value." { max 1 @@ -269,7 +272,7 @@ document { min 1 type "string" // https://json-schema.org/understanding-json-schema/reference/string.html#format - enum "date-time" "date" "time" "duration" "decimal" "currency" "country-2" "country-3" "country-subdivision" "email" "idn-email" "hostname" "idn-hostname" "ipv4" "ipv6" "url" "url-reference" "irl", "irl-reference" "url-template" "regex" "uuid" "kdl-query" "i8" "i16" "i32" "i64" "u8" "u16" "u32" "u64" "isize" "usize" "f32" "f64" "decimal64" "decimal128" + enum "date-time" "date" "time" "duration" "decimal" "currency" "country-2" "country-3" "country-subdivision" "email" "idn-email" "hostname" "idn-hostname" "ipv4" "ipv6" "url" "url-reference" "irl" "irl-reference" "url-template" "regex" "uuid" "kdl-query" "i8" "i16" "i32" "i64" "u8" "u16" "u32" "u64" "isize" "usize" "f32" "f64" "decimal64" "decimal128" } } node "%" description="Only used for numeric values. 
Constrains them to be multiples of the given number(s)" { diff --git a/tests/test_cases/expected_kdl/all_escapes.kdl b/tests/test_cases/expected_kdl/all_escapes.kdl index 024cda2..c25f434 100644 --- a/tests/test_cases/expected_kdl/all_escapes.kdl +++ b/tests/test_cases/expected_kdl/all_escapes.kdl @@ -1 +1 @@ -node "\"\\\b\f\n\r\t" +node "\"\\/\b\f\n\r\t" diff --git a/tests/test_cases/expected_kdl/arg_hex_type.kdl b/tests/test_cases/expected_kdl/arg_hex_type.kdl index ec44f6c..b1a494a 100644 --- a/tests/test_cases/expected_kdl/arg_hex_type.kdl +++ b/tests/test_cases/expected_kdl/arg_hex_type.kdl @@ -1 +1 @@ -node (type)0x10 +node (type)16 diff --git a/tests/test_cases/expected_kdl/binary.kdl b/tests/test_cases/expected_kdl/binary.kdl index 5d111b9..d14213e 100644 --- a/tests/test_cases/expected_kdl/binary.kdl +++ b/tests/test_cases/expected_kdl/binary.kdl @@ -1 +1 @@ -node 0b10 +node 2 diff --git a/tests/test_cases/expected_kdl/binary_trailing_underscore.kdl b/tests/test_cases/expected_kdl/binary_trailing_underscore.kdl index 5d111b9..d14213e 100644 --- a/tests/test_cases/expected_kdl/binary_trailing_underscore.kdl +++ b/tests/test_cases/expected_kdl/binary_trailing_underscore.kdl @@ -1 +1 @@ -node 0b10 +node 2 diff --git a/tests/test_cases/expected_kdl/binary_underscore.kdl b/tests/test_cases/expected_kdl/binary_underscore.kdl index 5d111b9..d14213e 100644 --- a/tests/test_cases/expected_kdl/binary_underscore.kdl +++ b/tests/test_cases/expected_kdl/binary_underscore.kdl @@ -1 +1 @@ -node 0b10 +node 2 diff --git a/tests/test_cases/expected_kdl/empty_child.kdl b/tests/test_cases/expected_kdl/empty_child.kdl index a166b33..64f5a0a 100644 --- a/tests/test_cases/expected_kdl/empty_child.kdl +++ b/tests/test_cases/expected_kdl/empty_child.kdl @@ -1,2 +1 @@ -node { -} +node diff --git a/tests/test_cases/expected_kdl/empty_child_different_lines.kdl b/tests/test_cases/expected_kdl/empty_child_different_lines.kdl index a166b33..64f5a0a 100644 --- 
a/tests/test_cases/expected_kdl/empty_child_different_lines.kdl +++ b/tests/test_cases/expected_kdl/empty_child_different_lines.kdl @@ -1,2 +1 @@ -node { -} +node diff --git a/tests/test_cases/expected_kdl/empty_child_same_line.kdl b/tests/test_cases/expected_kdl/empty_child_same_line.kdl index a166b33..64f5a0a 100644 --- a/tests/test_cases/expected_kdl/empty_child_same_line.kdl +++ b/tests/test_cases/expected_kdl/empty_child_same_line.kdl @@ -1,2 +1 @@ -node { -} +node diff --git a/tests/test_cases/expected_kdl/empty_child_whitespace.kdl b/tests/test_cases/expected_kdl/empty_child_whitespace.kdl index a166b33..64f5a0a 100644 --- a/tests/test_cases/expected_kdl/empty_child_whitespace.kdl +++ b/tests/test_cases/expected_kdl/empty_child_whitespace.kdl @@ -1,2 +1 @@ -node { -} +node diff --git a/tests/test_cases/expected_kdl/escline_comment_node.kdl b/tests/test_cases/expected_kdl/escline_comment_node.kdl deleted file mode 100644 index 1c5b5f3..0000000 --- a/tests/test_cases/expected_kdl/escline_comment_node.kdl +++ /dev/null @@ -1,2 +0,0 @@ -node1 -node2 diff --git a/tests/test_cases/expected_kdl/hex.kdl b/tests/test_cases/expected_kdl/hex.kdl index 6d1eba2..bcbc7ff 100644 --- a/tests/test_cases/expected_kdl/hex.kdl +++ b/tests/test_cases/expected_kdl/hex.kdl @@ -1 +1 @@ -node 0xabcdef1234567890 +node 12379813812177893520 diff --git a/tests/test_cases/expected_kdl/hex_int.kdl b/tests/test_cases/expected_kdl/hex_int.kdl index b552b7b..f8dcee1 100644 --- a/tests/test_cases/expected_kdl/hex_int.kdl +++ b/tests/test_cases/expected_kdl/hex_int.kdl @@ -1 +1 @@ -node 0xabcdef0123456789abcdef +node 207698809136909011942886895 diff --git a/tests/test_cases/expected_kdl/hex_int_underscores.kdl b/tests/test_cases/expected_kdl/hex_int_underscores.kdl index b18a9c3..78f3ce0 100644 --- a/tests/test_cases/expected_kdl/hex_int_underscores.kdl +++ b/tests/test_cases/expected_kdl/hex_int_underscores.kdl @@ -1 +1 @@ -node 0xabcdef0123 +node 737894400291 diff --git 
a/tests/test_cases/expected_kdl/hex_leading_zero.kdl b/tests/test_cases/expected_kdl/hex_leading_zero.kdl index c05ae7c..d20bd7d 100644 --- a/tests/test_cases/expected_kdl/hex_leading_zero.kdl +++ b/tests/test_cases/expected_kdl/hex_leading_zero.kdl @@ -1 +1 @@ -node 0x1 +node 1 diff --git a/tests/test_cases/expected_kdl/leading_zero_binary.kdl b/tests/test_cases/expected_kdl/leading_zero_binary.kdl index 2a38fed..d20bd7d 100644 --- a/tests/test_cases/expected_kdl/leading_zero_binary.kdl +++ b/tests/test_cases/expected_kdl/leading_zero_binary.kdl @@ -1 +1 @@ -node 0b1 +node 1 diff --git a/tests/test_cases/expected_kdl/leading_zero_oct.kdl b/tests/test_cases/expected_kdl/leading_zero_oct.kdl index 9585c83..d20bd7d 100644 --- a/tests/test_cases/expected_kdl/leading_zero_oct.kdl +++ b/tests/test_cases/expected_kdl/leading_zero_oct.kdl @@ -1 +1 @@ -node 0o1 +node 1 diff --git a/tests/test_cases/expected_kdl/octal.kdl b/tests/test_cases/expected_kdl/octal.kdl index 68bc955..225217b 100644 --- a/tests/test_cases/expected_kdl/octal.kdl +++ b/tests/test_cases/expected_kdl/octal.kdl @@ -1 +1 @@ -node 0o76543210 +node 16434824 diff --git a/tests/test_cases/expected_kdl/parse_all_arg_types.kdl b/tests/test_cases/expected_kdl/parse_all_arg_types.kdl index 3d1f3f7..2e8552c 100644 --- a/tests/test_cases/expected_kdl/parse_all_arg_types.kdl +++ b/tests/test_cases/expected_kdl/parse_all_arg_types.kdl @@ -1 +1 @@ -node 1 1.0 1.0E+10 1.0E-10 0x1 0o7 0b10 "arg" "arg\\\\" true false null +node 1 1.0 1.0E+10 1.0E-10 1 7 2 "arg" "arg\\\\" true false null diff --git a/tests/test_cases/expected_kdl/prop_hex_type.kdl~ b/tests/test_cases/expected_kdl/prop_hex_type.kdl~ deleted file mode 100644 index d819d6a..0000000 --- a/tests/test_cases/expected_kdl/prop_hex_type.kdl~ +++ /dev/null @@ -1 +0,0 @@ -node key=(type)0x10 diff --git a/tests/test_cases/expected_kdl/slashdash_node_in_child.kdl b/tests/test_cases/expected_kdl/slashdash_node_in_child.kdl index 56e0831..f50c4f2 100644 --- 
a/tests/test_cases/expected_kdl/slashdash_node_in_child.kdl +++ b/tests/test_cases/expected_kdl/slashdash_node_in_child.kdl @@ -1,2 +1 @@ -node1 { -} +node1 diff --git a/tests/test_cases/expected_kdl/slashdash_repeated_prop.kdl b/tests/test_cases/expected_kdl/slashdash_repeated_prop.kdl new file mode 100644 index 0000000..82c6972 --- /dev/null +++ b/tests/test_cases/expected_kdl/slashdash_repeated_prop.kdl @@ -0,0 +1 @@ +node arg="correct" diff --git a/tests/test_cases/expected_kdl/trailing_underscore_hex.kdl b/tests/test_cases/expected_kdl/trailing_underscore_hex.kdl index 5d6cf28..f426d4d 100644 --- a/tests/test_cases/expected_kdl/trailing_underscore_hex.kdl +++ b/tests/test_cases/expected_kdl/trailing_underscore_hex.kdl @@ -1 +1 @@ -node 0x123abc +node 1194684 diff --git a/tests/test_cases/expected_kdl/trailing_underscore_octal.kdl b/tests/test_cases/expected_kdl/trailing_underscore_octal.kdl index 0e653f9..9152a92 100644 --- a/tests/test_cases/expected_kdl/trailing_underscore_octal.kdl +++ b/tests/test_cases/expected_kdl/trailing_underscore_octal.kdl @@ -1 +1 @@ -node 0o123 +node 83 diff --git a/tests/test_cases/expected_kdl/underscore_in_fraction.kdl b/tests/test_cases/expected_kdl/underscore_in_fraction.kdl new file mode 100644 index 0000000..29bb938 --- /dev/null +++ b/tests/test_cases/expected_kdl/underscore_in_fraction.kdl @@ -0,0 +1 @@ +node 1.02 diff --git a/tests/test_cases/expected_kdl/underscore_in_octal.kdl b/tests/test_cases/expected_kdl/underscore_in_octal.kdl index 94f0c85..f4f6039 100644 --- a/tests/test_cases/expected_kdl/underscore_in_octal.kdl +++ b/tests/test_cases/expected_kdl/underscore_in_octal.kdl @@ -1 +1 @@ -node 0o1234567 +node 342391 diff --git a/tests/test_cases/input/slashdash_repeated_prop.kdl b/tests/test_cases/input/slashdash_repeated_prop.kdl new file mode 100644 index 0000000..b427175 --- /dev/null +++ b/tests/test_cases/input/slashdash_repeated_prop.kdl @@ -0,0 +1 @@ +node arg="correct" /- arg="wrong"