From 5845e31d6c46f4b3fccbd8305c658273f91a46c4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Kat=20March=C3=A1n?=
Date: Tue, 15 Dec 2020 22:45:21 -0800
Subject: [PATCH] bring back bom-as-ws

---
 SPEC.md | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/SPEC.md b/SPEC.md
index 7fbda91..8c8d2ca 100644
--- a/SPEC.md
+++ b/SPEC.md
@@ -17,12 +17,8 @@ The toplevel concept of KDL is a Document. A Document is composed of zero or
 more [Nodes](#node), separated by newlines and whitespace, and eventually
 terminated by an EOF.
 
-All KDL documents should:
-
-* Be UTF-8 encoded
-* Ignore UTF-8 byte order marks ("BOM") anywhere in the file, even when it's
-  not the first set of bytes in a stream.
-* Conform to the specifications in this document.
+All KDL documents should be UTF-8 encoded and conform to the specifications in
+this document.
 
 #### Example
 
@@ -198,7 +194,9 @@ linespace := newline | ws | single-line-comment
 
 newline := `000D` | `000A` | `000D` `000A` | `0085` | `000C` | `2028` | `2029`
 
-ws := unicode-space | multi-line-comment
+ws := bom |unicode-space | multi-line-comment
+
+bom := `FFEF`
 
 unicode-space := See Table (All White_Space unicode characters which are not `newline`)
 
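
The grammar change in this patch makes a byte order mark one more alternative of `ws`, so a parser can skip it wherever it skips ordinary whitespace instead of special-casing it. A minimal Python sketch of that idea follows. It is illustrative only: the names `BOM`, `NEWLINES`, `UNICODE_SPACE`, `is_ws_char`, and `skip_ws` are hypothetical, the whitespace set is an assumed stand-in for the spec's table, and multi-line comments are left out. The patch spells the `bom` terminal as `FFEF`; the Unicode byte order mark is U+FEFF, which is what the sketch assumes.

```python
# Sketch only: every name here is hypothetical, not taken from SPEC.md.

BOM = "\ufeff"  # U+FEFF, the Unicode byte order mark (assumed meaning of `bom`)

# Characters the grammar lists under `newline` (CRLF is CR followed by LF).
NEWLINES = {"\r", "\n", "\u0085", "\u000c", "\u2028", "\u2029"}

# Assumed stand-in for the spec's unicode-space table:
# White_Space characters that are not `newline`.
UNICODE_SPACE = (
    {"\u0009", "\u000b", "\u0020", "\u00a0", "\u1680", "\u202f", "\u205f", "\u3000"}
    | {chr(cp) for cp in range(0x2000, 0x200B)}  # U+2000 through U+200A
)


def is_ws_char(ch: str) -> bool:
    """True for the single-character `ws` alternatives: bom or unicode-space."""
    return ch == BOM or ch in UNICODE_SPACE


def skip_ws(text: str, pos: int = 0) -> int:
    """Return the index of the first character at or after pos that is not ws."""
    while pos < len(text) and is_ws_char(text[pos]):
        pos += 1
    return pos
```

For example, `skip_ws("\ufeff  node")` returns `3`: the leading BOM and the two spaces are consumed by the same loop, and a BOM in the middle of a line is handled identically.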