NewLang syntax description

This is a somewhat informal summary of the syntax for NewLang. The intended audience for this document is developers.

Currently this document does not match the language implementation, but instead describes the planned syntax.

Design Principles

NewLang's syntax is intended to be simple, verbose and unambiguous. This is intended to aid people with physical and cognitive disabilities who have trouble with symbols and with keeping track of large amounts of contextual information.

This section lists the conscious design decisions behind the syntax.

All syntax is textual

Often programming languages will use symbols to represent tokens. These are often borrowed mathematical and logical symbols.

These suffer from numerous accessibility issues:

  • Screen readers skip over symbols
  • Search engines ignore most symbols
  • Symbols can be hard to see with limited vision

To avoid these issues, NewLang uses text for all tokens. Some additional conventions apply:

  • Each word in a token starts with a capital letter. This aids screen readers when pronouncing the tokens
  • Tokens are case sensitive to avoid ambiguity

Whitespace is not important

Most developers use spaces and tabs to visually structure code. This convention helps in reading and understanding code.

Unfortunately, screen readers skip over whitespace. This makes it difficult to understand code structure, and even more difficult to write code.

NewLang treats whitespace as a separator between tokens. No whitespace is otherwise required.

Whitespace may still be used to structure code, but this is discouraged.

Nesting is limited

Programming languages often contain ways to specify context about when or why code will execute.

Some examples of these are:

  • Functions
  • If statements
  • Switch statements
  • Looping statements
  • Function calls in place of variables

These are often necessary to make a structured programming language and facilitate organization of code.

However, most programming languages will allow nesting these directives.

For example:

  • Functions in functions
  • If statements inside if statements
  • Loops inside if statements inside loops
  • Function calls as variables inside function calls as variables

Excessive nesting can create a large amount of cognitive overhead, so NewLang strictly limits nesting to tolerable levels.

Syntax Reference

The following is a complete listing of all syntax in NewLang.

The syntax is not specified using a formal grammar or specification language. Any ambiguity is accidental.

Whitespace

Spaces are the following code points:

  • U+0009 HORIZONTAL TAB
  • U+0020 SPACE

New lines are the following code point sequences:

  • U+000A LINE FEED
  • U+000B VERTICAL TAB
  • U+000C FORM FEED
  • U+000D CARRIAGE RETURN
  • U+000D U+000A CARRIAGE RETURN then LINE FEED
  • U+0085 NEXT LINE
  • U+2028 LINE SEPARATOR
  • U+2029 PARAGRAPH SEPARATOR

Both spaces and new lines are treated as whitespace.

Tokens

Syntax in NewLang is formed using alphanumeric tokens separated by whitespace.

For example, the following code snippet:

Hi There EveryOne
How are you

breaks down into the following tokens:

  • Hi
  • There
  • EveryOne
  • How
  • are
  • you
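
The tokenizing rule above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the NewLang implementation: the name `tokenize` is hypothetical, and the sketch assumes that any run of the listed whitespace code points counts as a single separator.

```python
import re

# The space and new line code points listed in the Whitespace section.
# A run of one or more of them is treated as a single separator (assumption).
_SEPARATOR = re.compile(r"[\t \n\x0b\x0c\r\x85\u2028\u2029]+")


def tokenize(source: str) -> list:
    """Split source text into tokens (hypothetical helper)."""
    # Leading or trailing whitespace produces empty strings, which are dropped.
    return [token for token in _SEPARATOR.split(source) if token]


print(tokenize("Hi There EveryOne\nHow are you"))
# prints ['Hi', 'There', 'EveryOne', 'How', 'are', 'you']
```

An explicit character class is used instead of Python's `str.split()`, because `str.split()` also splits on whitespace code points that this document does not list.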

File structure

NewLang supports storing code in files.

A file contains:

  • Directives

Files must be UTF-8 encoded. Normalization is not required.
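
As a rough sketch of the encoding rule, a loader could decode the file strictly as UTF-8 and leave the code points untouched. The name `read_source` is hypothetical, and rejecting invalid bytes with an error is an assumption; the document does not say how invalid files are handled.

```python
def read_source(data: bytes) -> str:
    """Decode the raw contents of a file as UTF-8 (hypothetical helper).

    Strict decoding raises UnicodeDecodeError on invalid bytes (assumption).
    No Unicode normalization is applied, so code points stay as written.
    """
    return data.decode("utf-8", errors="strict")


source = read_source("System Print Name Done".encode("utf-8"))
print(source)
# prints System Print Name Done
```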

Notes

Notes allow you to write text intended for other humans to read.

These are often referred to as comments in other programming languages.

Notes consist of:

  • A token with the literal text 'StartNote'
  • Arbitrary text
  • A token with the literal text 'EndNote'

'StartNote' is not allowed in the note's text. This helps catch cases of accidental nesting.

The text is otherwise ignored by NewLang.

Notes may be used anywhere code can appear.

Here's an example of specifying a note among other code:

StartNote Read the user's name EndNote
Set Name To System ReadLine EndSet
System Print Name Done

The note here is ignored by NewLang but read by humans.
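
One way to picture how notes are skipped is a pass over the token list that drops everything from 'StartNote' to 'EndNote'. This is a minimal sketch with a hypothetical `strip_notes` helper. It assumes that an unmatched 'EndNote' and a missing 'EndNote' are errors, and it ignores how notes interact with text values, since the document does not specify either.

```python
def strip_notes(tokens: list) -> list:
    """Return the tokens with every StartNote ... EndNote span removed."""
    result = []
    in_note = False
    for token in tokens:
        if in_note:
            if token == "StartNote":
                # Nested notes are rejected to catch accidental nesting.
                raise SyntaxError("StartNote is not allowed inside a note")
            if token == "EndNote":
                in_note = False
            # Any other token is note text and is ignored.
        elif token == "StartNote":
            in_note = True
        elif token == "EndNote":
            # Assumption: a stray EndNote is an error.
            raise SyntaxError("EndNote without a matching StartNote")
        else:
            result.append(token)
    if in_note:
        # Assumption: a note that never ends is an error.
        raise SyntaxError("note is missing EndNote")
    return result


# str.split() stands in for a real tokenizer on this ASCII-only input.
tokens = "StartNote Read the user's name EndNote System Print Name Done".split()
print(strip_notes(tokens))
# prints ['System', 'Print', 'Name', 'Done']
```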

Values

Values represent a piece of data used by the running program.

Values may be of these types:

  • Text
  • A boolean
  • A variable created earlier

See the sections for each of these value types on how to use them.

Text

Text values contain human-readable text.

Creating a text value consists of these tokens:

  • A token with the literal text 'StartText'
  • Arbitrary text that must not contain 'StartText'
  • A token with the literal text 'EndText'

'StartText' is not allowed in text values. This helps catch cases of accidental nesting.

Whitespace is not preserved in the arbitrary text; instead, the words are joined with a single U+0020 SPACE code point between each pair.

The arbitrary text will be used as Text data.

Here's an example:

System Print StartText  Hello,  world! EndText Done

This will print the text 'Hello, world!' with single spaces, because the doubled spaces in the source are not preserved.
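
The joining rule can be sketched as follows. The helper name `parse_text` and its return shape are hypothetical, and treating a missing 'EndText' as an error is an assumption.

```python
def parse_text(tokens: list, start: int = 0) -> tuple:
    """Parse a text value that begins at tokens[start].

    Returns the text and the index of the token after EndText.
    """
    if start >= len(tokens) or tokens[start] != "StartText":
        raise SyntaxError("expected StartText")
    words = []
    index = start + 1
    while index < len(tokens):
        token = tokens[index]
        if token == "StartText":
            # Nested text values are rejected to catch accidental nesting.
            raise SyntaxError("StartText is not allowed inside a text value")
        if token == "EndText":
            # Original whitespace is gone; words are joined with U+0020.
            return " ".join(words), index + 1
        words.append(token)
        index += 1
    # Assumption: a text value that never ends is an error.
    raise SyntaxError("text value is missing EndText")


# str.split() stands in for a real tokenizer on this ASCII-only input.
tokens = "System Print StartText  Hello,  world! EndText Done".split()
text, next_index = parse_text(tokens, 2)
print(text)
# prints Hello, world!
```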

Booleans

Booleans are values that can be one of two possibilities:

  • True
  • False

Booleans can be created by specifying True or False as a token. The value (True or False) will be used as Boolean data.

These don't literally mean true or false; those are just the names of each possibility.

Anything that has two options can be represented using a boolean. Instead of true or false you could think of these as:

  • Positive or negative
  • Yes or no
  • Allow or deny
  • On or off
  • Set or unset
  • Light or dark

Here's an example of using a boolean with an If directive:

If True
Then System Print StartText Hi EndText
Else System Print StartText Bye EndText
EndIf

This will print 'Hi' to the screen, because the If directive uses the boolean to decide whether to run the first action (after Then) or the second action (after Else).

Variables

Variables reference values using a name.

Specifying variables is done by writing the name as a token.

Variables are created using the Set directive.

Here's an example:

Set Time To StartText Twelve O'Clock EndText EndSet
System Print Time Done

This will create a variable named Time that references the text value 'Twelve O'Clock'.

When it's time to pass a value to System Print, the variable is used as a reference instead of the actual value.

As a result, this will print 'Twelve O'Clock' to the screen.
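
A small sketch of how a single-token value might be resolved: the tokens True and False become booleans, and any other token is looked up as a variable name. The `resolve` helper and the `variables` dictionary are hypothetical. Treating an unknown name as an error and giving True and False priority over variable names are assumptions.

```python
def resolve(token: str, variables: dict):
    """Turn a single-token value into the data it stands for."""
    # Tokens are case sensitive, so only these exact spellings are booleans.
    if token == "True":
        return True
    if token == "False":
        return False
    if token in variables:
        # The variable is a reference to a value created earlier.
        return variables[token]
    # Assumption: using a name that was never set is an error.
    raise NameError(f"variable {token!r} was not created earlier")


variables = {"Time": "Twelve O'Clock"}
print(resolve("Time", variables))
# prints Twelve O'Clock
```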

Actions

Actions instruct NewLang to create a new value.

An action consists of these tokens:

  • Subject (a Value)
  • Optionally a Verb (a textual token)
  • Optionally Arguments (one or more Values)

An action performs a verb on a subject with provided arguments.

The result of the action is a new value. This value is used by other language directives.

The number of arguments required is determined when running the verb.

If no verb is provided, the subject itself is used as the result.

Here's an example using a set directive:

Set UserName To System ReadLine EndSet

In this case the action is 'System ReadLine'. This action refers to the subject 'System' and the verb 'ReadLine'.

The result is used by the Set directive to create a variable.

This variable would contain the value read from a command prompt.
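
The shape of an action can be sketched as a small data structure. The `Action` class and `parse_action` helper are hypothetical. The sketch assumes that each value has already been reduced to one list item, and that when anything follows the subject, the first such item is the verb and the rest are arguments.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    subject: str
    verb: Optional[str] = None  # None means the subject is the result
    arguments: list = field(default_factory=list)


def parse_action(values: list) -> Action:
    """Split a list of values into subject, verb and arguments."""
    if not values:
        raise SyntaxError("an action needs a subject")
    subject = values[0]
    # Assumption: the item right after the subject, if any, is the verb.
    verb = values[1] if len(values) > 1 else None
    return Action(subject, verb, list(values[2:]))


print(parse_action(["System", "ReadLine"]))
# prints Action(subject='System', verb='ReadLine', arguments=[])
```

Whether the number of arguments is valid is not checked here, because the document states that this is determined when the verb runs.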

Directives

Directives instruct NewLang to perform some action or logic.

The following directives are available:

  • Command
  • Set
  • If

These are placed in a file in sequence.

Here's an example of multiple directives in a single file:

Set UserName To System ReadLine EndSet
If UserName Equals StartText Jookia EndText
Then System Print StartText Hi Jookia EndText
Else System Print StartText Howdy Stranger EndText
EndIf
System Exit 0 Done

See the sections for each of these directives on how to use them.

Command Directives

Command directives perform an action and discard the result.

The directive consists of these tokens:

  • An action
  • A token with the literal text 'Done'

Here's an example of using command directives:

System Print StartText Hit enter to continue EndText Done
System ReadLine Done

This will prompt the user to press enter, then read and discard a line.

In this case the results of Print and ReadLine don't matter. What matters is that the user had to hit enter to finish the line.
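
Finding the end of a command directive could look like the sketch below. The `parse_command` helper is hypothetical. It assumes that a 'Done' appearing inside a text value does not end the command, which the document does not state.

```python
def parse_command(tokens: list) -> tuple:
    """Split off one command directive from the start of tokens.

    Returns the action's tokens and the tokens that follow 'Done'.
    """
    in_text = False
    for index, token in enumerate(tokens):
        if in_text:
            # Assumption: tokens inside a text value never end the command.
            if token == "EndText":
                in_text = False
        elif token == "StartText":
            in_text = True
        elif token == "Done":
            if index == 0:
                raise SyntaxError("a command needs an action before Done")
            return tokens[:index], tokens[index + 1:]
    raise SyntaxError("command is missing Done")


# str.split() stands in for a real tokenizer on this ASCII-only input.
tokens = "System Print StartText Hit enter to continue EndText Done System ReadLine Done".split()
first, rest = parse_command(tokens)
second, rest = parse_command(rest)
print(second)
# prints ['System', 'ReadLine']
```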

Set Directives

Set directives create a variable containing the result of an action.

The directive consists of these tokens:

  • A token with the literal text 'Set'
  • A token containing the name of the variable
  • A token with the literal text 'To'
  • An action
  • A token with the literal text 'EndSet'

The variable name must be unique.

Here's an example of using the set directive:

Set Message To StartText Hi there! EndText EndSet
System Print Message Done

This creates a variable named 'Message' and prints it to the screen.
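
The token sequence of a set directive, including the uniqueness rule, can be sketched like this. The `parse_set` helper and the `known_names` set are hypothetical. As a simplification, the sketch does not handle an 'EndSet' that appears inside a text value.

```python
def parse_set(tokens: list, known_names: set) -> tuple:
    """Parse 'Set <name> To <action> EndSet' from the start of tokens.

    Returns the variable name, the action's tokens and the remaining tokens.
    """
    if len(tokens) < 3 or tokens[0] != "Set":
        raise SyntaxError("expected Set followed by a name and To")
    name = tokens[1]
    if name in known_names:
        # The variable name must be unique.
        raise SyntaxError(f"variable {name!r} already exists")
    if tokens[2] != "To":
        raise SyntaxError("expected To after the variable name")
    try:
        # Simplification: the first EndSet after To closes the directive.
        end = tokens.index("EndSet", 3)
    except ValueError:
        raise SyntaxError("set directive is missing EndSet") from None
    known_names.add(name)
    return name, tokens[3:end], tokens[end + 1:]


# str.split() stands in for a real tokenizer on this ASCII-only input.
tokens = "Set Message To StartText Hi there! EndText EndSet System Print Message Done".split()
name, action, rest = parse_set(tokens, set())
print(name, action)
# prints Message ['StartText', 'Hi', 'there!', 'EndText']
```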

If Directives

If directives run a test action, then run either a success action or a failure action depending on the test action's result.

The result of the success or failure action is discarded.

The directive consists of these tokens:

  • A token with the literal text 'If'
  • The test action
  • A token with the literal text 'Then'
  • The success action
  • A token with the literal text 'Else'
  • The failure action
  • A token with the literal text 'EndIf'

Here's an example of using the if directive:

If True
Then System Print StartText Success EndText
Else System Print StartText Failure EndText
EndIf

This will print 'Success' to the screen.
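
The structure of an if directive can be sketched by splitting the tokens at 'Then', 'Else' and 'EndIf'. The `parse_if` helper is hypothetical. As a simplification, the sketch does not handle these words appearing inside text values, and the branch selection at the end only covers a test that is the single token True or False.

```python
def parse_if(tokens: list) -> tuple:
    """Parse 'If <test> Then <success> Else <failure> EndIf'.

    Returns the token lists of the test, success and failure actions.
    """
    if not tokens or tokens[0] != "If":
        raise SyntaxError("expected If")
    try:
        # Simplification: keywords inside text values are not handled.
        then_at = tokens.index("Then", 1)
        else_at = tokens.index("Else", then_at + 1)
        end_at = tokens.index("EndIf", else_at + 1)
    except ValueError:
        raise SyntaxError("if directive needs Then, Else and EndIf in order") from None
    test = tokens[1:then_at]
    success = tokens[then_at + 1:else_at]
    failure = tokens[else_at + 1:end_at]
    return test, success, failure


# str.split() stands in for a real tokenizer on this ASCII-only input.
source = "If True Then System Print StartText Success EndText Else System Print StartText Failure EndText EndIf"
test, success, failure = parse_if(source.split())
# Only a literal True or False test is handled in this sketch.
chosen = success if test == ["True"] else failure
print(chosen)
# prints ['System', 'Print', 'StartText', 'Success', 'EndText']
```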