diff --git a/docs/syntax.md b/docs/syntax.md
new file mode 100644
index 0000000..f64dd31
--- /dev/null
+++ b/docs/syntax.md
@@ -0,0 +1,422 @@
+# NewLang syntax description
+
+This is a somewhat informal summary of the syntax for NewLang.
+The intended audience for this document is developers.
+
+- Copyright: 2021 Jookia
+- SPDX-License-Identifier: LGPL-2.1-or-later
+
+## Design Principles
+
+NewLang's syntax is intended to be simple, verbose and unambiguous.
+This is intended to aid people with physical and cognitive disabilities
+that have trouble with symbols and keeping track of large amounts of
+contextual information.
+
+This section lists the conscious design decisions behind the syntax.
+
+### All syntax is textual
+
+Often programming languages will use symbols to represent tokens.
+These are often borrowed mathematical and logical symbols.
+
+These suffer from numerous accessibility issues:
+
+- Screen readers skip over symbols
+- Search engines ignore most symbols
+- Symbols can be hard to see with limited vision
+
+To avoid these issues, NewLang uses text for all tokens.
+Some additional conventions apply:
+
+- Each word in a token starts with a capital letter.
+  This aids screen readers when pronouncing the tokens.
+- Tokens are case sensitive to avoid ambiguity
+
+### Whitespace is not important
+
+Most developers use spaces and tabs to visually structure code.
+This convention helps in reading and understanding code.
+
+Unfortunately, screen readers skip over whitespace.
+This makes it difficult to understand code structure,
+and even more difficult to write.
+
+NewLang treats whitespace as a separator between tokens.
+No whitespace is otherwise required.
+
+Whitespace may still be used to structure code,
+but this is discouraged.
+
+### Nesting is limited
+
+Programming languages often contain ways to specify context about when
+or why code will execute.
+
+Some examples of these are:
+
+- Functions
+- If statements
+- Switch statements
+- Looping statements
+- Function calls in place of variables
+
+These are often necessary to make a structured programming language and
+facilitate organization of code.
+
+However, most programming languages will allow nesting these directives.
+
+For example:
+
+- Functions in functions
+- If statements inside if statements
+- Loops inside if statements inside loops
+- Function calls as variables inside function calls as variables
+
+Excessive nesting can create a large amount of cognitive overhead,
+so NewLang strictly limits nesting to tolerable levels.
+
+## Syntax Reference
+
+The following is a complete listing of all syntax in NewLang.
+
+This is not specified using a formal grammar or language.
+Any ambiguity is accidental.
+
+### Tokens
+
+Syntax in NewLang is formed using alphanumeric tokens separated by whitespace.
+
+Whitespace can be the following characters:
+
+- A space
+- A tab
+- A new line
+
+For example, the following code snippet:
+
+```
+Hi There EveryOne
+How are you
+```
+
+Breaks down into the following tokens:
+
+- Hi
+- There
+- EveryOne
+- How
+- are
+- you
+
+### File structure
+
+NewLang supports storing code in files. A file consists of:
+
+- An optional shebang
+- A NewLang directive
+- Directives
+
+Files must be UTF-8 encoded and use Unix new lines.
+
+#### Shebangs
+
+Unix and Unix-like systems allow turning text files into regular executables.
+This is done by placing a shebang at the start of a file.
+
+A shebang consists of a line containing:
+
+- A hash (#)
+- An exclamation point (!)
+- Optionally whitespace
+- The full path to the interpreter
+- Optionally whitespace followed by command line arguments
+
+Here's an example of a file with a shebang:
+
+```
+#! /bin/bash --verbose
+echo Hi there!
+```
+
+NewLang will detect shebangs and skip them, instead leaving them for the
+operating system to handle.
+
+#### NewLang directive
+
+The NewLang directive specifies the version of NewLang used to write the code.
+
+The NewLang directive consists of two tokens:
+
+- The literal text 'NewLang'
+- The NewLang version number
+
+Here's an example of specifying version 0:
+
+```
+NewLang 0
+```
+
+This directive is used by NewLang to preserve backwards compatibility with
+code written for older versions of itself.
+
+#### Notes
+
+Notes allow you to write text intended for other humans to read.
+
+These are often referred to as comments in other programming languages.
+
+Notes consist of:
+
+- A token with the literal text 'StartNote'
+- Arbitrary text
+- A token with the literal text 'EndNote'
+
+'StartNote' is not allowed in the note's text.
+This helps catch cases of accidental nesting.
+
+The text is otherwise ignored by NewLang.
+
+Notes may be used anywhere code can appear.
+
+Here's an example of specifying a note among other code:
+
+```
+NewLang 0
+StartNote Read the user's name EndNote
+Set Name To System ReadLine EndSet
+System Print Name Done
+```
+
+The note here is ignored by NewLang but read by humans.
+
+### Values
+
+A value represents a piece of data used by the running program.
+
+Values may be of these types:
+
+- Text
+- A boolean
+- A variable created earlier
+
+See the sections for each of these value types on how to use them.
+
+#### Text
+
+Text values contain human-readable text.
+
+Creating a text value consists of these tokens:
+
+- A token with the literal text 'StartText'
+- Arbitrary text that must not contain 'StartText'
+- A token with the literal text 'EndText'
+
+'StartText' is not allowed in text values.
+This helps catch cases of accidental nesting.
+
+Whitespace after 'StartText' and before 'EndText' is automatically removed.
+
+The arbitrary text will be used as Text data.
+
+Here's an example:
+
+```
+NewLang 0
+System Print StartText Hello, world! EndText Done
+```
+
+This will print the text 'Hello, world!'
+
+#### Booleans
+
+Booleans are values that can be one of two possibilities:
+
+- True
+- False
+
+Booleans can be created by specifying True or False as a token.
+The value (True or False) will be used as Boolean data.
+
+These don't literally mean true or false; those are just the names
+of each possibility.
+
+Anything that has two options can be represented using a boolean.
+Instead of true or false you could think of these as:
+
+- Positive or negative
+- Yes or no
+- Allow or deny
+- On or off
+- Set or unset
+- Light or dark
+
+Here's an example of using a boolean with an If directive:
+
+```
+NewLang 0
+If True
+Then System Print StartText Hi EndText
+Else System Print StartText Bye EndText
+EndIf
+```
+
+This will print 'Hi' to the screen because the If directive uses the
+boolean to decide whether to run the first action or the second.
+
+#### Variables
+
+Variables reference values using a name.
+
+Specifying variables is done by writing the name as a token.
+
+Variables are created using the Set directive.
+
+Here's an example:
+
+```
+NewLang 0
+Set Time To StartText Twelve O'Clock EndText EndSet
+System Print Time Done
+```
+
+This will create a variable named Time that references the text
+value 'Twelve O'Clock'.
+
+When it's time to pass a value to System Print, the variable is used
+as a reference instead of the actual value.
+
+As a result, this will print 'Twelve O'Clock' to the screen.
+
+### Actions
+
+Actions instruct NewLang to create a new value.
+
+An action consists of these tokens:
+
+- Subject (a Value)
+- Optionally a Verb (a textual token)
+- Optionally Arguments (one or more Values)
+
+An action performs a verb on a subject with provided arguments.
+
+The result of the action is a new value.
+This value is used by other language directives.
+
+The number of arguments required is determined when running the verb.
+
+If no verb is provided, the subject itself is used as the result.
+
+Here's an example using a set directive:
+
+```
+NewLang 0
+Set UserName To System ReadLine EndSet
+```
+
+In this case the action is 'System ReadLine'.
+This action refers to the subject 'System' and the verb 'ReadLine'.
+
+The result is used by the Set directive to create a variable.
+
+This variable would contain the value read from a command prompt.
+
+### Directives
+
+Directives instruct NewLang to perform some action or logic.
+
+The following directives are available:
+
+- Command
+- Set
+- If
+
+These are placed in a file in sequence.
+
+Here's an example of multiple directives in a single file:
+
+```
+NewLang 0
+Set UserName To System ReadLine EndSet
+If UserName Equals StartText Jookia EndText
+Then System Print StartText Hi Jookia EndText
+Else System Print StartText Howdy Stranger EndText
+EndIf
+System Exit 0 Done
+```
+
+See the sections for each of these directives on how to use them.
+
+#### Command Directives
+
+Command directives perform an action and discard the result.
+
+The directive consists of these tokens:
+
+- An action
+- A token with the literal text 'Done'
+
+Here's an example of using command directives:
+
+```
+NewLang 0
+System Print StartText Hit enter to continue EndText Done
+System ReadLine Done
+```
+
+This will prompt the user to press enter,
+then read and discard a line.
+
+In this case the results of Print and ReadLine don't matter,
+only the fact that the user had to hit enter to write a line does.
+
+#### Set Directives
+
+Set directives create a variable containing the result of an action.
+
+The directive consists of these tokens:
+
+- A token with the literal text 'Set'
+- A token containing the name of the variable
+- A token with the literal text 'To'
+- An action
+- A token with the literal text 'EndSet'
+
+The variable name must be unique.
+
+Here's an example of using the set directive:
+
+```
+NewLang 0
+Set Message To StartText Hi there! EndText EndSet
+System Print Message Done
+```
+
+This creates a variable named 'Message' and prints it to the screen.
+
+#### If Directives
+
+If directives run a test action, then either a success or failure action
+depending on the test action's result.
+
+The result of the success or failure action is discarded.
+
+The directive consists of these tokens:
+
+- A token with the literal text 'If'
+- The test action
+- A token with the literal text 'Then'
+- The success action
+- A token with the literal text 'Else'
+- The failure action
+- A token with the literal text 'EndIf'
+
+```
+NewLang 0
+If True
+Then System Print StartText Success EndText
+Else System Print StartText Failure EndText
+EndIf
+```
+
+This will print 'Success' to the screen.
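
The token layout of the If directive described above can be sketched in code. The Python snippet below is a rough illustration only and is not part of NewLang or this patch: the helper names `tokenize` and `split_if_directive` are hypothetical, it assumes tokens are separated only by spaces, tabs and new lines, and it assumes that 'Then', 'Else' and 'EndIf' appearing between 'StartText' and 'EndText' are ordinary text rather than keywords.

```python
import re
from typing import List, Tuple


def tokenize(source: str) -> List[str]:
    """Split source into tokens on spaces, tabs and new lines (assumed rule)."""
    return [token for token in re.split(r"[ \t\n]+", source) if token]


def split_if_directive(tokens: List[str]) -> Tuple[List[str], List[str], List[str]]:
    """Return the (test, success, failure) action tokens of one If directive.

    Hypothetical helper: it only separates the three actions and does not
    validate or run them.
    """
    keywords = ["If", "Then", "Else", "EndIf"]
    sections: List[List[str]] = []  # one token list per keyword seen so far
    in_text = False                 # True between StartText and EndText
    for token in tokens:
        next_keyword = keywords[len(sections)] if len(sections) < 4 else None
        if not in_text and token == next_keyword:
            sections.append([])
        elif not sections or len(sections) == 4:
            # Token before 'If' or after 'EndIf'.
            raise ValueError(f"unexpected token {token!r}")
        else:
            sections[-1].append(token)
            if token == "StartText":
                in_text = True
            elif token == "EndText":
                in_text = False
    if len(sections) != 4:
        raise ValueError("incomplete If directive")
    test, success, failure, _after_end_if = sections
    return test, success, failure
```

Calling `split_if_directive(tokenize(...))` on the If directive from the example above (without the `NewLang 0` line) separates the test action `True` from the two `System Print` actions. A real implementation would also need to reject 'StartText' inside text values and handle notes, which this sketch leaves out.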