From 0092ad84dbe1906be27bec3f746a510741ee96a9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Kat=20March=C3=A1n?=
Date: Mon, 14 Dec 2020 22:49:27 -0800
Subject: [PATCH] wip long-form spec stuff

---
 SPEC.md | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/SPEC.md b/SPEC.md
index a6f8491..c1f3cd8 100644
--- a/SPEC.md
+++ b/SPEC.md
@@ -3,6 +3,90 @@
 This is the kinda-formal specification for KDL, including the intended data
 model and the grammar.
 
+## Introduction
+
+KDL is a node-oriented document language. Its niche and purpose overlap with
+those of XML, as do many of its semantics. You can use KDL both as a
+configuration language and as a data exchange or storage format, if you so choose.
+
+## Components
+
+### Document
+
+The toplevel concept of KDL is a Document. A Document is composed of one or more
+[Nodes](#node), separated by newlines and whitespace, and eventually terminated by an EOF.
+
+#### Example
+
+The following is a document composed of two toplevel nodes:
+
+```kdl
+foo {
+    bar
+}
+baz
+```
+
+### Node
+
+Being a node-oriented language means that the real core component of any KDL
+document is the "node". Every node must have a name, which is either a legal
+[Identifier](#identifier) or a quoted [String](#string).
+
+Following the name are one or more [Whitespace](#whitespace) components,
+followed by zero or more whitespace-separated [Values](#value) or
+[Properties](#property). Finally, a node is terminated by either a
+[Newline](#newline), a [Children Block](#children-block), a semicolon (`;`), or
+the end of the
+file/stream (an `EOF`).
+
+Plain Values (those not attached to a Property) that appear in the list of
+Properties and Values are "anonymous". Each anonymous value should be treated
+as a Property whose key is its index among _anonymous values_ in the same
+node, starting from 0, as a string. Named properties do not count towards
+this index.
+
+That is, the following two nodes are semantically equivalent:
+
+```kdl
+foo 1 key="val" 2
+foo "0"=1 "1"=2 key="val"
+```
+
+#### Example
+
+```kdl
+foo 1 key="val" 3 {
+    bar
+    baz
+}
+```
+
+### Identifier
+
+A bare Identifier is composed of any Unicode codepoint other than [non-initial
+characters](#non-initial-characters), followed by any number of Unicode
+codepoints other than [non-identifier characters](#non-identifier-characters).
+Identifiers are terminated by [Whitespace](#whitespace) or
+[Newlines](#newline).
+
+### Non-initial characters
+
+The following characters cannot be the first character in a bare
+[Identifier](#identifier):
+
+* Any of "/\\{};[]=,"
+* Any decimal digit (0-9)
+* Any [non-identifier characters](#non-identifier-characters)
+
+### Non-identifier characters
+
+The following characters cannot be used anywhere in a bare [Identifier](#identifier):
+
+* Any codepoint with hexadecimal value `0x20` or below.
+* Any codepoint with hexadecimal value higher than `0x10FFFF`.
+* Any of "\\{};[]=,"
+
 ## Full Grammar
 
 ```