syntax: Add functions #17

Open Jookia opened this issue on 25 May 2022 - 2 comments

@Jookia Jookia commented on 25 May 2022

I've been brainstorming for a while on how to add some structure to NewLang, so we can have nice things like control flow, loops, and re-usable code blocks.

Most languages do this with functions and nested code blocks that loop. Aside from the obvious problem of nesting code being off limits for this project, loops in general are confusing as they rely on mutability and have the interpreter kind of jump around the code in weird ways.

Most languages seem to really like nesting blocks of code, but there are a few that do it differently:

  • Early BASIC variants
  • Forth
  • Assembly

Early BASIC and Assembly are kind of spaghetti, relying on jumping to different lines without much in the way of containing variables and state to stop them leaking out. Forth's design is interesting: You write very small functions that call other functions, and you write little nested loops in it. But loops still require mutation and jumping around in your head.


I wasn't satisfied with any of these solutions. After toying with some concepts and running them past Xogium and some others I've decided to steal some functional programming ideas instead.

The solution I've come up with is fairly simple: Functions take arguments and return values. Like most languages, functions can run other functions, set variables, do if statements, and return values. However in NewLang functions can do two other things.

The first thing NewLang functions can do that most can't is jump to other functions, like a goto. This is done by returning a function call. Most languages call this tail call elimination. With it we can implement loops like this:

StartFunction InfiniteLoop Do
System Print StartText Hello There EndText
Return InfiniteLoop
EndFunction

This function takes no arguments, prints "Hello There", then runs itself again.
This loops forever.

An example of using this for iteration is something like this:

StartFunction CountDown Number Do
If Number Equals 0 Then Return End
Set NextNumber To Number Minus 1 EndSet
System Print Number
Return CountDown NextNumber
EndFunction

This function would take a number, then call itself each time but with a smaller number until it reaches 0. Then it would return.

In most languages these would fail to run properly because each function would return back to the one that called it, but in this case return acts like a goto instead.
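A rough Python sketch of the difference, for anyone who wants to see it run. Python does not eliminate tail calls, so the plain recursive version runs out of stack, while a small driver loop that treats "return a call" as "go there next" does not. The names `TailCall` and `run` are made up for this illustration and are not part of NewLang.

```python
import sys


class TailCall:
    """Hypothetical marker meaning 'go to this function next'."""

    def __init__(self, function, *args):
        self.function = function
        self.args = args


def count_down_recursive(number):
    # Ordinary recursion: every call waits for the next one to return,
    # so each step keeps one more stack frame alive.
    if number == 0:
        return None
    return count_down_recursive(number - 1)


def count_down(number, seen):
    # Same logic, but instead of calling itself it hands back a
    # description of the next call. Numbers are collected in `seen`
    # rather than printed so the result is easy to check.
    if number == 0:
        return None
    seen.append(number)
    return TailCall(count_down, number - 1, seen)


def run(function, *args):
    # Driver loop: keep jumping until a function returns a plain value.
    result = function(*args)
    while isinstance(result, TailCall):
        result = result.function(*result.args)
    return result


# Pick a depth that is guaranteed to exceed Python's recursion limit.
depth = sys.getrecursionlimit() + 100

try:
    count_down_recursive(depth)
    recursive_ok = True
except RecursionError:
    recursive_ok = False

seen = []
run(count_down, depth, seen)
```

The recursive version fails at this depth, while the driver version visits every number using a constant amount of stack.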


The next feature is creating functions as variables. In NewLang using SetFunction you can create a function at runtime that includes runtime data. In most languages this is called a lambda or closure. The win here is that we can then have functions that do looping for us so we never have to worry about it. For example:

BeginNote Runs a function when counting down EndNote
StartFunction DoTimes Number Function Do
If Number Equals 0 Then Return End
Set NextNumber To Number Minus 1 EndSet
Function
Return DoTimes NextNumber Function
EndFunction

BeginNote Say someone's name many times EndNote
StartFunction SayNameManyTimes Do
System Print StartText Enter your name EndText
Set Name To System ReadLine EndSet
System Print StartText Enter times to say your name EndText
Set Times To System ReadNumber EndSet
SetFunction SayName To System Print Name EndFunction
DoTimes Times SayName
EndFunction

As you can see, by modifying CountDown to take a function and run that instead of printing, we've made a loop function that runs another function. Then, after reading some data, we make a function at runtime that prints a variable and pass it to DoTimes to run.
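The same shape in Python, as a sketch only. The looping function is written with a plain `while` here, since Python has no jump-style return, and the user input is replaced with fixed values so the example can run on its own. `make_say_name` and the `output` list are illustration-only names.

```python
def do_times(number, function):
    # Stand-in for DoTimes: run `function` once per count.
    # NewLang would express this as a function that jumps to itself
    # with a smaller number; a loop has the same effect here.
    while number != 0:
        function()
        number -= 1


def make_say_name(name, output):
    # Stand-in for SetFunction: builds a function at runtime that
    # remembers `name` (a closure). It appends to `output` instead of
    # printing so the result can be inspected.
    def say_name():
        output.append(name)

    return say_name


output = []
say_name = make_say_name("Ada", output)  # fixed value instead of ReadLine
do_times(3, say_name)                    # fixed value instead of ReadNumber
```

The caller never writes a loop itself; it only builds a function and hands it to the looping function.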


I'm sure implementing this is going to be nice and painful, but right now the biggest headache is the syntax changes.

The first issue is what should the syntax for declaring functions be? I like the form of 'StartFunction Name Arg1 Arg2 Do .... EndFunction' but something like 'Function Name WithArgs Arg1 Arg2 Do ... EndFunction' might work too.

The second issue is kind of a bigger one. Function calling in NewLang is currently 'Subject Verb Arguments', such as 'System Exit 0', or 'MyList Append Name'. These are kind of like non-mutable objects and do the same thing as closures: They pass some runtime data to some function (like MyList to Append) in a nice looking way. This is probably going to be used for namespacing and other stuff in the future.

Calling functions without a subject seems easy enough: Just check if Subject is a function, and if it is, skip the verb part. So we have two forms: 'Subject Verb Arguments' and just 'Verb Arguments'. This is fine so far. But what about just Verb?

In the above example you see me writing 'Return InfiniteLoop' and this is ambiguous. Are we returning the function as a value, or calling it and returning that value (or goto it) with no arguments? In most languages this is done by having a specific notation for calling things.
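For comparison, this is how Python's call notation separates the two readings. It is only an illustration of the ambiguity, not a proposal for NewLang syntax, and `infinite_loop_stub` is a made-up stand-in that returns instead of looping.

```python
def infinite_loop_stub():
    # Stand-in for InfiniteLoop that returns instead of looping forever.
    return "ran"


def return_the_function():
    # No parentheses: the function itself is returned as a value.
    return infinite_loop_stub


def return_the_call():
    # Parentheses: the function is called and its result is returned.
    return infinite_loop_stub()


as_value = return_the_function()
as_call = return_the_call()
```

Without some marker playing the role of the parentheses, a bare 'Return InfiniteLoop' could mean either of these.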

Perhaps we could instead of having 'Verb Arguments' we have a 'Call' subject? Or some marker to indicate we're calling it. So we'd do 'Call InfiniteLoop' or 'Return Call InfiniteLoop'. It's a bit confusing since we don't need that for things like 'System Exit 0'.

It also could just be that 'Return' should be only used for values, with something like 'Jump' or 'Goto' for function calls.

@Jookia Jookia changed priority from default to high on 25 May 2022
@Jookia Jookia added the syntax label on 25 May 2022

Okay after getting feedback I have a few ideas about this. The first thing is about the function call syntax.

To recap: The issue with a function call syntax of either 'Subject Verb Args' or 'Verb Args' is that a call with no args is just 'Verb', which is the same syntax as writing a variable.

One way to solve this is to shove 'Call' in front of it, but that's still kind of weird, right?

The other way is to have special syntax just for handling functions and others for handling variables. So 'Return' only returns values, 'Goto' only takes functions. But that doesn't work when setting variables.

Going back to the 'Call' syntax, I think a good solution is to make functions truly first-class like any other variable or object, and have it so you can run 'Call' on them. So 'MyFunction Call' or 'MyFunction Call Arg1 Arg2 Arg3' etc. In the future we could also have other verbs for inspecting functions. This also means there's no separate syntax: everything is Subject Verb Args.
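One way to picture this in Python, as a sketch under assumptions: a function becomes an object whose verbs are looked up by name, so a bare subject is always a value and a call always names a verb. `FunctionValue`, `evaluate`, and the 'Name' inspection verb are all hypothetical and only here for illustration.

```python
class FunctionValue:
    """Hypothetical wrapper that gives a function verbs like any object."""

    def __init__(self, name, body):
        self.name = name
        self.body = body

    def verb_call(self, *args):
        # 'MyFunction Call Arg1 Arg2' runs the wrapped function.
        return self.body(*args)

    def verb_name(self):
        # Example of a possible inspection verb (an assumption).
        return self.name


def evaluate(subject, verb=None, *args):
    # A bare subject is a value. Anything with a verb is a call.
    if verb is None:
        return subject
    method = getattr(subject, "verb_" + verb.lower())
    return method(*args)


add = FunctionValue("Add", lambda a, b: a + b)

bare = evaluate(add)                  # like 'Add': the function as a value
called = evaluate(add, "Call", 2, 3)  # like 'Add Call 2 3'
inspected = evaluate(add, "Name")     # like 'Add Name'
```

Under this reading a bare name never triggers a call, which removes the ambiguity.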

The second piece of feedback is the confusion about returning function calls. So here's some example code:

Return System ReadLine

In most languages this would have the code call System ReadLine, then return from ReadLine back to here, collect the return value, then return that. So it has to remember to come back to the return statement here. If you used this pattern to loop, those pending returns would pile up and eventually cause a stack overflow.

In NewLang it optimizes this to 'don't return here, just go to ReadLine and it can return for us'. So a better example might be to have it as:

Goto System ReadLine

I kind of like this idea as it makes more sense. Return returns a value, Goto jumps to code.

However, there's kind of a catch. What should this code below do?

Set Value To System ReadLine EndSet
Return Value

It sets a value then returns the value. This could be optimized to a Goto quite easily. Should it be? Or should Gotos be explicit?

I lean on the side of 'if we have Gotos, then we should only jump to functions when we use Gotos'. This would make it easier to understand and avoid guessing about returns and optimizations.

As an update, I think having a 'Jump' statement that passes control + the stack frame to another function is a good idea, with return returning control to the last known caller. I say Jump instead of Goto because after extensive polling and people saying either Goto or Jump is fine, one person I randomly talked to said Jump is better. I think it also works a bit better as goto is often used for jumping around within functions, not outside them. Now that I think about it, C has longjmp/setjmp for inter-function jumping, and that uses this terminology.

I also think optimizing to Jumps is probably not worth it given it will make the backtrace a bit more complicated.
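A toy model of the stack makes the backtrace concern concrete. This is a sketch only: `FrameStack` and the function names 'Main' and 'ReadName' are invented for the illustration. A call pushes a frame, a jump replaces the current frame, and a return pops it.

```python
class FrameStack:
    """Toy model of a call stack holding only function names."""

    def __init__(self):
        self.frames = []

    def call(self, name):
        # Normal call: the caller stays on the stack and waits.
        self.frames.append(name)

    def jump(self, name):
        # Jump: the current frame is handed over to the target.
        self.frames[-1] = name

    def ret(self):
        # Return: control goes back to the last known caller.
        self.frames.pop()

    def backtrace(self):
        # Outermost caller first.
        return list(self.frames)


# Main calls ReadName, which calls System ReadLine and returns the value.
explicit = FrameStack()
explicit.call("Main")
explicit.call("ReadName")
explicit.call("ReadLine")
trace_with_call = explicit.backtrace()

# Same code, but the final call was silently optimized into a jump.
optimized = FrameStack()
optimized.call("Main")
optimized.call("ReadName")
optimized.jump("ReadLine")
trace_with_jump = optimized.backtrace()

# Returning from ReadLine now lands directly in Main.
optimized.ret()
after_return = optimized.backtrace()
```

With the jump, ReadName is missing from the backtrace while ReadLine runs, which is easier to accept when the programmer wrote the jump explicitly.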

@Jookia Jookia changed title from Add functions to syntax: Add functions on 31 May 2022