mirror of https://github.com/kdl-org/kdl.git
More explicitly use and reference a cut point, rather than infallibility.
This commit is contained in:
parent
b5e8aaf035
commit
63090831cd
22
SPEC.md
22
SPEC.md
|
|
@ -855,7 +855,7 @@ value := type? node-space* (string | number | keyword)
|
|||
type := '(' node-space* string node-space* ')'
|
||||
|
||||
// Strings
|
||||
string := identifier-string | quoted-string | raw-string
|
||||
string := identifier-string | quoted-string | raw-string ¶
|
||||
|
||||
identifier-string := unambiguous-ident | signed-ident | dotted-ident
|
||||
unambiguous-ident := ((identifier-char - digit - sign - '.') identifier-char*) - disallowed-keyword-strings
|
||||
|
|
@ -927,14 +927,20 @@ Specifically:
|
|||
characters using hex values (`\u{FEFF}`), and for escaping `\` itself
|
||||
(`\\`).
|
||||
* `*` is used for "zero or more", `+` is used for "one or more", and `?` is
|
||||
used for "zero or one".
|
||||
* `*?` (used only in raw strings) indicates a *non-greedy* match.
|
||||
It also indicates *infallibility*, with a scope of the `string` production:
|
||||
once it successfully matches enough characters to satisfy that production
|
||||
the first time, it is not allowed to backtrack and continue matching further,
|
||||
even if that results in a parse failure.
|
||||
used for "zero or one". Per standard regex semantics, `*` and `+` are *greedy*;
|
||||
they match as many instances as possible without failing the match.
|
||||
* `*?` (used only in raw strings) indicates a *non-greedy* match;
|
||||
it matches as *few* instances as possible without failing the match.
|
||||
* `¶` is a *cut point*. It always matches and consumes no characters,
|
||||
but once matched, the parser is not allowed to backtrack past that point in the source.
|
||||
If a parser would rewind past the cut point, it must instead fail the overall parse,
|
||||
as if it had run out of options.
|
||||
(This is only used with the `raw-string` production,
|
||||
to ensure the first instance of the appropriate closing quote sequence
|
||||
is guaranteed to be the end of the raw string,
|
||||
rather than allowing it to potentially consume more of the document unexpectedly.)
|
||||
* `()` can be used to group matches that must be matched together.
|
||||
* `a | b` means `a or b`, whichever matches first. If multipe items are before
|
||||
* `a | b` means `a or b`, whichever matches first. If multiple items are before
|
||||
a `|`, they are a single group. `a b c | d` is equivalent to `(a b c) | d`.
|
||||
* `[]` are used for regex-style character matches, where any character between
|
||||
the brackets will be a single match. `\` is used to escape `\`, `[`, and
|
||||
|
|
|
|||
Loading…
Reference in New Issue