mirror of https://github.com/kdl-org/kdl.git
Merge branch 'kdl-v2' into patch-2
commit
8ccbc92fed
|
|
@ -0,0 +1,19 @@
|
|||
# KDL Changelog
|
||||
|
||||
## 2.0.0 (2022-08-28)
|
||||
|
||||
### Grammar
|
||||
|
||||
* Solidus/Forward slash (`/`) is no longer an escaped character.
|
||||
* Single line comments (`//`) can now be immediately followed by a newline.
|
||||
* All literal whitespace following a `\` in a string is now discarded.
|
||||
* Vertical tabs (`U+000B`) are now considered to be whitespace.
|
||||
* Identifiers can't start with `r#`, so they're easy to distinguish from raw strings. (They already similarly can't start with a digit, or a sign+digit, so they're easy to distinguish from numbers.)
|
||||
|
||||
### KQL
|
||||
|
||||
* There's now a _required_ descendant selector (`>>`), instead of using plain
|
||||
spaces for that purpose.
|
||||
* The "any sibling" selector is now `++` instead of `~`, for consistency with
|
||||
the new descendant selector.
|
||||
* Map operators have been removed entirely.
|
||||
|
|
@ -5,20 +5,20 @@ documents to extract nodes and even specific data. It is loosely based on CSS
|
|||
selectors for familiarity and ease of use. Think of it as CSS Selectors or
|
||||
XPath, but for KDL!
|
||||
|
||||
This document describes KQL `1.0.0`. It was released on September 11, 2021.
|
||||
This document describes KQL `next`. It is unreleased.
|
||||
|
||||
## Selectors
|
||||
|
||||
Selectors use selection operators to filter nodes that will be returned by an
|
||||
API using KQL. The main differences between this and CSS selectors are the
|
||||
lack of `*` (use `[]` instead), and the specific syntax for
|
||||
lack of `*` (use `[]` instead), the specific syntax for descendants and siblings, and the specific syntax for
|
||||
[matchers](#matchers) (the stuff between `[` and `]`), which is similar, but not identical to CSS.
|
||||
|
||||
* `a > b`: Selects any `b` element that is a direct child of an `a` element.
|
||||
* `a b`: Selects any `b` element that is a _descendant_ of an `a` element.
|
||||
* `a b || a c`: Selects all `b` and `c` elements that are descendants of an `a` element. Any selector may be on either side of the `||`. Multiple `||` are supported.
|
||||
* `a >> b`: Selects any `b` element that is a _descendant_ of an `a` element.
|
||||
* `a >> b || a >> c`: Selects all `b` and `c` elements that are descendants of an `a` element. Any selector may be on either side of the `||`. Multiple `||` are supported.
|
||||
* `a + b`: Selects any `b` element that is placed immediately after a sibling `a` element.
|
||||
* `a ~ b`: Selects any `b` element that follows an `a` element as a sibling, either immediately or later.
|
||||
* `a ++ b`: Selects any `b` element that follows an `a` element as a sibling, either immediately or later.
|
||||
* `[accessor()]`: Selects any element, filtered by [an accessor](#accessors). (`accessor()` is a placeholder, not an actual accessor)
|
||||
* `a[accessor()]`: Selects any `a` element, filtered by an accessor.
|
||||
* `[]`: Selects any element.
|
||||
|
|
@ -69,33 +69,6 @@ is not one of those, the matcher will always fail:
|
|||
|
||||
* `[val() = (foo)]`: Selects any element whose tag is "foo".
|
||||
|
||||
## Map Operator
|
||||
|
||||
KQL implementations MAY support a "map operator", `=>`, that allows selection
|
||||
of specific parts of the selected nodes, essentially "mapping" over a
|
||||
selector's result set.
|
||||
|
||||
Only a single map operator may be used, and it must be the last element in a
|
||||
selector string.
|
||||
|
||||
The map operator's right hand side is either an [`accessor`](#accessors) on
|
||||
its own, or a tuple of accessors, denoted by a comma-separated list wrapped in
|
||||
`()` (for example, `(a, b, c)`).
|
||||
|
||||
## Accessors
|
||||
|
||||
Accessors access/extract specific parts of a node. They are used with the [map
|
||||
operator](#map-operator), and have syntactic overlap with some
|
||||
[matchers](#matchers).
|
||||
|
||||
* `name()`: Returns the name of the node itself.
|
||||
* `val(2)`: Returns the third value in a node.
|
||||
* `val()`: Equivalent to `val(0)`.
|
||||
* `prop(foo)`: Returns the value of the property `foo` in the node.
|
||||
* `foo`: Equivalent to `prop(foo)`.
|
||||
* `props()`: Returns all properties of the node as an object.
|
||||
* `values()`: Returns all values of the node as an array.
|
||||
|
||||
## Examples
|
||||
|
||||
Given this document:
|
||||
|
|
@ -108,16 +81,16 @@ package {
|
|||
winapi "1.0.0" path="./crates/my-winapi-fork"
|
||||
}
|
||||
dependencies {
|
||||
miette "2.0.0" dev=true
|
||||
miette "2.0.0" dev=true integrity=(sri)"sha512-deadbeef"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Then the following queries are valid:
|
||||
|
||||
* `package name`
|
||||
* `package >> name`
|
||||
* -> fetches the `name` node itself
|
||||
* `top() > package name`
|
||||
* `top() > package >> name`
|
||||
* -> fetches the `name` node, guaranteeing that `package` is in the document root.
|
||||
* `dependencies`
|
||||
* -> deep-fetches both `dependencies` nodes
|
||||
|
|
@ -129,14 +102,20 @@ Then the following queries are valid:
|
|||
* -> fetches all direct-child nodes of any `dependencies` nodes in the
|
||||
document. In this case, it will fetch both `miette` and `winapi` nodes.
|
||||
|
||||
If using an API that supports the [map operator](#map-operator), the following
|
||||
are valid queries:
|
||||
## Full Grammar
|
||||
|
||||
* `package name => val()`
|
||||
* -> `["foo"]`.
|
||||
* `dependencies[platform] => platform`
|
||||
* -> `["windows"]`
|
||||
* `dependencies > [] => (name(), val(), path)`
|
||||
* -> `[("winapi", "1.0.0", "./crates/my-winapi-fork"), ("miette", "2.0.0", None)]`
|
||||
* `dependencies > [] => (name(), values(), props())`
|
||||
* -> `[("winapi", ["1.0.0"], {"platform": "windows"}), ("miette", ["2.0.0"], {"dev": true})]`
|
||||
For rules that are not defined in this grammar, see [the KDL grammar](https://github.com/kdl-org/kdl/blob/main/SPEC.md#full-grammar).
|
||||
|
||||
```
|
||||
query := selector q-ws* "||" q-ws* query | selector
|
||||
selector := filter q-ws* selector-operator q-ws* selector | filter
|
||||
selector-operator := ">>" | ">" | "++" | "+"
|
||||
filter := matcher+
|
||||
matcher := "top()"| "()" | identifier | type | accessor-matcher
|
||||
accessor-matcher := "[" (comparison | accessor)? "]"
|
||||
comparison := accessor q-ws* matcher-operator q-ws* (type | string | number | keyword)
|
||||
accessor := "val(" number ")" | "prop(" identifier ")" | "name()" | "tag()" | "values()" | "props()" | identifier
|
||||
matcher-operator := "=" | "!=" | ">" | "<" | ">=" | "<=" | "^=" | "$=" | "*="
|
||||
|
||||
q-ws := bom | unicode-space
|
||||
```
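The `selector-operator` rule above lists `>>` before `>` and `++` before `+`, which suggests longest-match-first scanning. The following is a minimal Python sketch of that idea, not part of the specification: `split_selector` is a hypothetical helper that splits one selector into filters and operators while skipping anything inside `[...]` or a quoted string. As a simplification, it treats every `+` outside brackets and strings as an operator, so bare identifiers containing `+` are not handled, and raw strings and `||` are ignored.

```python
# Two-character operators first, so ">>" is never read as two ">" tokens.
OPERATORS = (">>", "++", ">", "+")


def split_selector(selector: str) -> list[str]:
    """Split a single KQL selector into alternating filters and operators."""
    parts: list[str] = []
    current: list[str] = []
    depth = 0          # nesting depth of "[...]" accessor matchers
    in_string = False  # inside a double-quoted string
    i = 0
    while i < len(selector):
        ch = selector[i]
        if in_string:
            current.append(ch)
            if ch == "\\" and i + 1 < len(selector):
                # Keep the escaped character verbatim and skip past it.
                current.append(selector[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
            continue
        if ch == '"':
            in_string = True
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif depth == 0:
            op = next((o for o in OPERATORS if selector.startswith(o, i)), None)
            if op is not None:
                parts.append("".join(current).strip())
                parts.append(op)
                current = []
                i += len(op)
                continue
        current.append(ch)
        i += 1
    parts.append("".join(current).strip())
    return parts
```

For instance, `split_selector("top() > package >> name")` yields the three filters separated by `>` and `>>`, while a `>` inside a matcher such as `a[val() > 1]` stays part of its filter.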
|
||||
|
|
|
|||
62
SPEC.md
|
|
@ -3,9 +3,7 @@
|
|||
This is the semi-formal specification for KDL, including the intended data
|
||||
model and the grammar.
|
||||
|
||||
This document describes KDL version `2.0.0-preview`.
|
||||
|
||||
KDL version `1.0.0` was released on September 11, 2021.
|
||||
This document describes KDL version `1.0.0`. It was released on September 11, 2021.
|
||||
|
||||
## Introduction
|
||||
|
||||
|
|
@ -26,22 +24,6 @@ the directions if the data stream were only ASCII text. They do not refer
|
|||
to the writing direction of text, which can flow in either direction,
|
||||
depending on the characters used.
|
||||
|
||||
## Changes from version `1.0.0`
|
||||
|
||||
### Relaxed
|
||||
|
||||
- The way that `/-` comments are handled has changed. Now, `/-` comments are
|
||||
consistently treated like whitespace. Notably, this means that `/-` children
|
||||
blocks do not prevent the presence of later arguments, properties, or children
|
||||
blocks on the attached node.
|
||||
|
||||
### Constrained
|
||||
|
||||
- Previously, whitespace was not required before a children block, i.e. `node{}`
|
||||
was valid. Now, whitespace is required before a children block, the same as
|
||||
before arguments and properties.
|
||||
- `/-` comments on nodes must also be separated by plain (non-`/-`) whitespace.
|
||||
|
||||
## Components
|
||||
|
||||
### Document
|
||||
|
|
@ -327,6 +309,8 @@ String Value can encompass multiple lines without behaving like a Newline for
|
|||
|
||||
Strings _MUST_ be represented as UTF-8 values.
|
||||
|
||||
#### Escapes
|
||||
|
||||
In addition to literal code points, a number of "escapes" are supported.
|
||||
"Escapes" are the character `\` followed by another character, and are
|
||||
interpreted as described in the following table:
|
||||
|
|
@ -337,11 +321,39 @@ interpreted as described in the following table:
|
|||
| Carriage Return | `\r` | `U+000D` |
|
||||
| Character Tabulation (Tab) | `\t` | `U+0009` |
|
||||
| Reverse Solidus (Backslash) | `\\` | `U+005C` |
|
||||
| Solidus (Forwardslash) | `\/` | `U+002F` |
|
||||
| Quotation Mark (Double Quote) | `\"` | `U+0022` |
|
||||
| Backspace | `\b` | `U+0008` |
|
||||
| Form Feed | `\f` | `U+000C` |
|
||||
| Unicode Escape | `\u{(1-6 hex chars)}` | Code point described by hex characters, up to `10FFFF` |
|
||||
| Whitespace Escape | See below | N/A |
|
||||
|
||||
##### Escaped Whitespace
|
||||
|
||||
In addition to escaping individual characters, `\` can also escape whitespace.
|
||||
When a `\` is followed by one or more literal whitespace characters, the `\`
|
||||
and all of that whitespace are discarded. For example, `"Hello World"` and
|
||||
`"Hello \ World"` are semantically identical. See [whitespace](#whitespace)
|
||||
and [newlines](#newlines) for how whitespace is defined.
|
||||
|
||||
Note that only literal whitespace is escaped; *escaped* whitespace is retained.
|
||||
For example, these strings are all semantically identical:
|
||||
|
||||
```kdl
|
||||
"Hello\ \nWorld"
|
||||
|
||||
"Hello\n\
|
||||
World"
|
||||
|
||||
"Hello\nWorld"
|
||||
|
||||
"Hello
|
||||
World"
|
||||
```
|
||||
|
||||
##### Invalid escapes
|
||||
|
||||
Except as described in the escapes table, above, `\` *MUST NOT* precede any
|
||||
other characters in a string.
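The escape rules above (single-character escapes, `\u{...}`, the whitespace escape, and the rejection of everything else, including `\/`) can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation: `unescape` is a hypothetical helper that operates on the text between the quotes, and `LITERAL_WHITESPACE` is an assumed, partial stand-in for the whitespace and newline tables.

```python
import re

# Single-character escapes from the escapes table; "/" is deliberately absent.
SIMPLE_ESCAPES = {
    "n": "\n", "r": "\r", "t": "\t", "\\": "\\",
    '"': '"', "b": "\b", "f": "\f",
}

# Assumption: an illustrative subset of the whitespace and newline tables.
LITERAL_WHITESPACE = set("\t\x0b \u00a0\u1680\r\n\x0c\u0085\u2028\u2029")

UNICODE_ESCAPE = re.compile(r"u\{([0-9a-fA-F]{1,6})\}")


def unescape(body: str) -> str:
    """Resolve escapes in the text between the quotes of an escaped string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1  # step past the backslash
        if i >= len(body):
            raise ValueError("dangling backslash at end of string")
        nxt = body[i]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 1
        elif nxt in LITERAL_WHITESPACE:
            # Whitespace escape: discard the backslash and the whole literal run.
            while i < len(body) and body[i] in LITERAL_WHITESPACE:
                i += 1
        elif (match := UNICODE_ESCAPE.match(body, i)) is not None:
            code = int(match.group(1), 16)
            if code > 0x10FFFF:
                raise ValueError("unicode escape above 10FFFF")
            out.append(chr(code))
            i = match.end()
        else:
            raise ValueError(f"invalid escape: \\{nxt}")
    return "".join(out)
```

Because only literal whitespace is discarded, `unescape` leaves the newline produced by `\n` in place even when a whitespace escape sits right next to it.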
|
||||
|
||||
### Raw String
|
||||
|
||||
|
|
@ -415,6 +427,7 @@ space](https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt):
|
|||
| Name | Code Pt |
|
||||
|----------------------|---------|
|
||||
| Character Tabulation | `U+0009` |
|
||||
| Line Tabulation | `U+000B` |
|
||||
| Space | `U+0020` |
|
||||
| No-Break Space | `U+00A0` |
|
||||
| Ogham Space Mark | `U+1680` |
|
||||
|
|
@ -477,7 +490,10 @@ node-children := '{' nodes '}'
|
|||
node-terminator := single-line-comment | newline | ';' | eof
|
||||
|
||||
identifier := string | bare-identifier
|
||||
bare-identifier := ((identifier-char - digit - sign) identifier-char* | sign ((identifier-char - digit) identifier-char*)?) - keyword
|
||||
bare-identifier := (unambiguous-ident | numberish-ident | stringish-ident) - keyword
|
||||
unambiguous-ident := (identifier-char - digit - sign - "r") identifier-char*
|
||||
numberish-ident := sign ((identifier-char - digit) identifier-char*)?
|
||||
stringish-ident := "r" ((identifier-char - "#") identifier-char*)?
|
||||
identifier-char := unicode - line-space - [\/(){}<>;[]=,"]
|
||||
keyword := boolean | 'null'
|
||||
prop := identifier '=' value
|
||||
|
|
@ -487,7 +503,7 @@ type := '(' identifier ')'
|
|||
string := raw-string | escaped-string
|
||||
escaped-string := '"' character* '"'
|
||||
character := '\' escape | [^\"]
|
||||
escape := ["\\/bfnrt] | 'u{' hex-digit{1, 6} '}'
|
||||
escape := ["\\bfnrt] | 'u{' hex-digit{1, 6} '}' | (unicode-space | newline)+
|
||||
hex-digit := [0-9a-fA-F]
|
||||
|
||||
raw-string := 'r' raw-string-hash
|
||||
|
|
@ -518,7 +534,7 @@ bom := '\u{FEFF}'
|
|||
|
||||
unicode-space := See Table (All White_Space unicode characters which are not `newline`)
|
||||
|
||||
single-line-comment := '//' ^newline+ (newline | eof)
|
||||
single-line-comment := '//' ^newline* (newline | eof)
|
||||
multi-line-comment := '/*' commented-block
|
||||
commented-block := '*/' | (multi-line-comment | '*' | '/' | [^*/]+) commented-block
|
||||
```
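The new `bare-identifier` rules in the grammar above can be read as a three-way check on the first character. The Python sketch below is one possible reading, not the specification's own algorithm: `is_bare_identifier` is a hypothetical helper, and `str.isspace()` plus the BOM is used as an approximation of `line-space`, which does not exactly match the whitespace and newline tables.

```python
# Characters excluded from identifier-char by the grammar.
FORBIDDEN = set('\\/(){}<>;[]=,"')
KEYWORDS = {"true", "false", "null"}
DIGITS = set("0123456789")


def is_identifier_char(ch: str) -> bool:
    # Assumption: str.isspace() plus the BOM approximates `line-space`.
    return not ch.isspace() and ch != "\ufeff" and ch not in FORBIDDEN


def is_bare_identifier(text: str) -> bool:
    """Check `text` against the unambiguous/numberish/stringish rules."""
    if not text or text in KEYWORDS:
        return False
    if not all(is_identifier_char(ch) for ch in text):
        return False
    first, rest = text[0], text[1:]
    if first in "+-":
        # numberish-ident: a sign must not be followed by a digit.
        return not rest or rest[0] not in DIGITS
    if first == "r":
        # stringish-ident: "r" must not be followed by "#".
        return not rest or rest[0] != "#"
    # unambiguous-ident: any other start except a digit.
    return first not in DIGITS
```

Under this reading, `r` and `raw` are accepted while `r#foo` is rejected, which mirrors how `-` and `+a` are accepted while `-1` is rejected.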
|
||||
|
|
|
|||
|
|
@ -1 +1 @@
|
|||
node "\"\\/\b\f\n\r\t"
|
||||
node "\"\\\b\f\n\r\t"
|
||||
|
|
|
|||
|
|
@ -0,0 +1 @@
|
|||
node
|
||||
|
|
@ -0,0 +1 @@
|
|||
node "Hello\n\tWorld" "Hello\n\tWorld" "Hello\n\tWorld" "Hello\n\tWorld" "Hello\n\tWorld" "Hello\n\tWorld"
|
||||
|
|
@ -1 +1 @@
|
|||
node "\"\\\/\b\f\n\r\t"
|
||||
node "\"\\\b\f\n\r\t"
|
||||
|
|
|
|||
|
|
@ -0,0 +1,2 @@
|
|||
//
|
||||
node
|
||||
|
|
@ -0,0 +1,15 @@
|
|||
// All of these strings are the same
|
||||
node \
|
||||
"Hello\n\tWorld" \
|
||||
"Hello
|
||||
World" \
|
||||
"Hello\n\ \tWorld" \
|
||||
"Hello\n\
|
||||
\tWorld" \
|
||||
"Hello
|
||||
\ \tWorld" \
|
||||
"Hello\n\t\
|
||||
World"
|
||||
|
||||
// Note that this file deliberately mixes space and newline indentation for
|
||||
// test purposes
|
||||
|
|
@ -0,0 +1 @@
|
|||
node "\/"
|
||||