mirror of https://github.com/kdl-org/kdl.git
fix some confusion in grammar syntax, and actually specify the syntax itself (#351)
Fixes: https://github.com/kdl-org/kdl/issues/345
parent eb55930264
commit 99abeef6d3
@ -9,6 +9,7 @@
* All literal whitespace following a `\` in a string is now discarded.
* Vertical tabs (`U+000B`) are now considered to be whitespace.
* Identifiers can't start with `r#`, so they're easy to distinguish from raw strings. (They already similarly can't start with a digit, or a sign+digit, so they're easy to distinguish from numbers.)
* The grammar syntax itself has been described, and some confusing definitions in the grammar have been fixed accordingly (mostly related to escaped characters).

### KQL

28
SPEC.md
@ -472,6 +472,10 @@ Note that for the purpose of new lines, CRLF is considered _a single newline_.

## Full Grammar

This is the full official grammar for KDL and should be considered
authoritative if something seems to disagree with the text above. The [grammar
language syntax](#grammar-language) is defined below.

```
nodes := (line-space* node)* line-space*

@ -494,7 +498,7 @@ bare-identifier := (unambiguous-ident | numberish-ident | stringish-ident) - key
unambiguous-ident := (identifier-char - digit - sign - "r") identifier-char*
numberish-ident := sign ((identifier-char - digit) identifier-char*)?
stringish-ident := "r" ((identifier-char - "#") identifier-char*)?
identifier-char := unicode - line-space - [\/(){}<>;[]=,"]
identifier-char := unicode - line-space - [\\/(){}<>;\[\]=,"]
keyword := boolean | 'null'
prop := identifier '=' value
value := type? (string | number | keyword)

@ -538,3 +542,25 @@ single-line-comment := '//' ^newline* (newline | eof)
multi-line-comment := '/*' commented-block
commented-block := '*/' | (multi-line-comment | '*' | '/' | [^*/]+) commented-block
```
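
The only grammar rule this commit changes is `identifier-char`: the bracket class goes from `[\/(){}<>;[]=,"]` to `[\\/(){}<>;\[\]=,"]`, so that `\`, `[`, and `]` are escaped explicitly. As a rough illustration, the corrected class can be transcribed almost character for character into a Python regular expression. This is only a sketch: `FORBIDDEN` and `is_identifier_char` are hypothetical names, and the `unicode - line-space` part of the rule is approximated with `str.isspace()`, because `line-space` is defined elsewhere in SPEC.md and not in this diff.

```python
import re

# The bracket class from the corrected rule:
#   [\\/(){}<>;\[\]=,"]
# Inside a Python character class `\` and `]` must be escaped as well;
# escaping `[` is optional in Python but kept here to mirror the grammar.
FORBIDDEN = re.compile(r'[\\/(){}<>;\[\]=,"]')


def is_identifier_char(ch: str) -> bool:
    """Approximate `identifier-char` for a single character.

    Assumption: `line-space` is approximated by str.isspace(); the
    authoritative definition lives in the full grammar.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return not ch.isspace() and FORBIDDEN.fullmatch(ch) is None


if __name__ == "__main__":
    for ch in ("a", "-", "\\", "[", "]", "=", " "):
        print(repr(ch), is_identifier_char(ch))
```

Read with regex-style escaping, the old class would treat `\/` as an escaped `/` and leave the status of `\`, `[`, and `]` unclear, which is presumably the confusion the commit message refers to.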

### Grammar language

The grammar language syntax is a combination of ABNF with some regex spice thrown in.
Specifically:

* Single quotes (`'`) are used to denote literal text. `\` within a literal
  string is used for escaping other single-quotes, for writing Unicode
  characters by hex value (`\u{FEFF}`), and for escaping `\` itself
  (`\\`).
* `*` is used for "zero or more", `+` is used for "one or more", and `?` is
  used for "zero or one".
* `()` can be used to group matches that must be matched together.
* `a | b` means `a or b`, whichever matches first. If multiple items are before
  a `|`, they are a single group. `a b c | d` is equivalent to `(a b c) | d`.
* `[]` are used for regex-style character matches, where any character between
  the brackets will be a single match. `\` is used to escape `\`, `[`, and
  `]`. They also support character ranges (`0-9`), and negation (`^`).
* `-` is used for "except for" or "minus" whatever follows it. For example,
  `a - 'x'` means "any `a`, except something that matches the literal `'x'`".
* The prefix `^` means "something that does not match" whatever follows it.
  For example, `^foo` means "must not match `foo`".
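
To make the operators above concrete, here is a minimal sketch that models a few of them as Python predicates over a candidate string. All names (`Matcher`, `literal`, `char_class`, `either`, `minus`, `negate`, `first_char`) are hypothetical and not part of KDL. The definitions of `digit` and `sign` are assumed to be `[0-9]` and `'+' | '-'`, and `unicode` is replaced by "any single character"; none of these definitions appear in this diff.

```python
from typing import Callable

# Hypothetical model: a matcher is a predicate over a candidate string.
Matcher = Callable[[str], bool]


def literal(text: str) -> Matcher:
    """'x' : matches exactly the literal text."""
    return lambda s: s == text


def char_class(chars: str) -> Matcher:
    """[abc] : matches any single character listed between the brackets."""
    return lambda s: len(s) == 1 and s in chars


def either(a: Matcher, b: Matcher) -> Matcher:
    """a | b : try `a` first, then `b`."""
    return lambda s: a(s) or b(s)


def minus(a: Matcher, b: Matcher) -> Matcher:
    """a - b : anything `a` matches, except what `b` matches."""
    return lambda s: a(s) and not b(s)


def negate(a: Matcher) -> Matcher:
    """^a : something that does not match `a`."""
    return lambda s: not a(s)


def any_char(s: str) -> bool:
    """Stand-in for `unicode` (assumption: any single character)."""
    return len(s) == 1


digit = char_class("0123456789")          # assumed: digit := [0-9]
sign = either(literal("+"), literal("-"))  # assumed: sign := '+' | '-'

# First character of `unambiguous-ident`, simplified:
#   (identifier-char - digit - sign - "r")
# with `identifier-char` replaced by `any_char` for brevity.
first_char = minus(minus(minus(any_char, digit), sign), literal("r"))

if __name__ == "__main__":
    for candidate in ("a", "r", "7", "-"):
        print(repr(candidate), first_char(candidate))
```

Chained `-` is treated here as left-associative, so `a - b - c` removes both `b` and `c` from `a`; the text above does not state associativity, so that reading is an assumption.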