doc: resolve inconsistency in banned chars in escaped form (#450)

Also add it explicitly to the spec grammar to make it clear that escaped forms are also affected
This commit is contained in:
Evgeny 2024-12-24 14:57:10 +07:00 committed by GitHub
parent 646dafcd35
commit 76dc3e3002
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
1 changed files with 7 additions and 2 deletions

View File

@ -819,7 +819,8 @@ considered _a single newline_.
### Disallowed Literal Code Points
The following code points may not appear literally anywhere in the document.
They may be represented in Strings (but not Raw Strings) using [Unicode Escapes](#escapes) (`\u{...}`).
They may be represented in Strings (but not Raw Strings) using [Unicode Escapes](#escapes) (`\u{...}`,
except for non Unicode Scalar Value, which can't be represented even as escapes).
* The codepoints `U+0000-0008` or the codepoints `U+000E-001F` (various
control characters).
@ -876,9 +877,13 @@ disallowed-keyword-identifiers := 'true' | 'false' | 'null' | 'inf' | '-inf' | '
quoted-string := '"' single-line-string-body '"' | '"""' newline multi-line-string-body newline (unicode-space | ws-escape)* '"""'
single-line-string-body := (string-character - newline)*
multi-line-string-body := (('"' | '""')? string-character)*
string-character := '\\' (["\\bfnrts] | 'u{' hex-digit{1, 6} '}') | ws-escape | [^\\"] - disallowed-literal-code-points
string-character := '\\' (["\\bfnrts] | 'u{' hex-unicode '}') | ws-escape | [^\\"] - disallowed-literal-code-points
ws-escape := '\\' (unicode-space | newline)+
hex-digit := [0-9a-fA-F]
hex-unicode := hex-digit{1, 6} - surrogates
surrogates := [dD][8-9a-fA-F]hex-digit{2}
// U+D800-DFFF: D 8 00
// D F FF
raw-string := '#' raw-string-quotes '#' | '#' raw-string '#'
raw-string-quotes := '"' single-line-raw-string-body '"' | '"""' newline multi-line-raw-string-body newline unicode-space* '"""'