@MightyPork @freundTech I dunno I always get the feeling writing a compiler involves handling lots and lots of special cases

@cpsdqs Because I'm trying to create a language, that compiles to .mcfunction, a format used by minecraft datapacks.

It's basically just a list of minecraft commands in a file, that can be run with /function <NAME>.

I'm currently stuck on code generation for arithmetic expressions, as all of the resources I can find for that use a stack machine, however I can't think of a way to implement a stack with scoreboards, so I have to use an accumulator/register machine instead.

@freundTech wait what minecraft commands are now turing complete?

@cpsdqs I don't know. Maybe. Guess I'll see if/when I hit a problem I can't solve..

@freundTech I think you could totally have a stack machine using invisible armor stands or something

@cpsdqs Probably. But I want to use as little entities as possible to reduce lag. A pure scoreboard implementation would be better.
If that's not possible I have to do something with entities though.

@freundTech also how does control flow work? does it like generate multiple mcfunction files or something

@cpsdqs Yes. There can be multiple files, which can call each other using /function.

Also in 1.13 /execute has an if argument:

/execute if score ... run function ...

@cpsdqs There is a gamerule maxCommandChainLength, which is the maximum amount of commands that can be executed with one function call. It counts the commands in the function itself and all functions called from it.

It's 65536 by default, but can take any int value.

@cpsdqs Idea using 1 entity per function call:
When a function is called it spawns an entity with a tag identifying it as a stack frame. Before that it increments a scoreboard value for all entities with that tag.
Then it uses the entity with the lowest number to store it's variables and registers.
Before returning it kills it's entity and decrements the value for all others.

Now the number of registers needed in a function should be constant.

@freundTech hm does an entity need to exist for it to be in a scoreboard? maybe you could just use one as a sort of stack pointer

@cpsdqs It doesn't need to exist to be in the scoreboard, but entities that don't exist can only be addressed with a fixed name. Not with selectors.

Here I would be using a selector to select the entity with the lowest value. For this it needs to exist.

@freundTech oh wait are they identified by their UUID in their scoreboard or something?

@cpsdqs Yes. But if you use an entity that doesn't exist you can just use any name in the scoreboard.

@freundTech oh no I was just wondering why the whole stack needs to be kept around

@cpsdqs Recursive function calls.
A function that doesn't call other functions can be implemented with a constant number of variables (scoreboard scores).
Once the function calls itself if will need twice the amount of variables, as it's still using the first set.
I need a way to store those local variables.

@cpsdqs successfully compiled "test2 = test + 1" to scoreboard commands!

@freundTech wow
that’s more than my ruby vm could do after a day of writing stuff

@cpsdqs Well to be fair: I've been writing on it for 2 days now.

Also I should clean up my code. It's already starting to get messy...

@cpsdqs python... Maybe not the fastest language, but fast to write. Also lark-parser is a great python library for parsing context-free grammars.

@freundTech I have no idea what that is but apparently I will this Friday

@cpsdqs There are different types of languages, which require different automata to parse.

The most simple are regular languages, which can be described using regular expressions and parsed using a finite state automaton.

Then there are context-free languages, describes by context-free grammars. Most programming languages are context-free languages. They can't be described by regular expressions.

Then there context-sensitive and recursively enumerable languages.

@freundTech sometimes I wonder how they spend an hour explaining something that fits into three lines

@cpsdqs There's a lot more to explain about that. For example how those grammars work and how to do the actual parsing.

For example regular expressions are defined as such:

∅ (empty set) is a regular expression
𝜀 (empty word) is a regular expression
All a ∈ Σ (set of terminals/letters) are regular expressions
If α and β are regexs then αβ, (α|β) and (α)* are also regexs.

Every regex can be describes using only those rules. Modern languages just add stuff to make writing them easier.

@cpsdqs @freundTech then there's nonsense languages i invented for my many dumb projects and they're probably something in between

@MightyPork @freundTech this all feels like, discovering that there’s an actual correct way of doing stuff you just botched together for years

@cpsdqs context-free grammars are a bit more complicated.

You have a start symbol (S), a set of terminals (T), a set of non-terminals (N) and a set of productions (P).

For example for describing all terms with correct parenthesis you would set T = {𝜀,(,)}, N = {TERM} and P = {
S ⇒ TERM
TERM ⇒ (TERM) | TERM . TERM | 𝜀
}

For example (())() matches that grammar, while (() and )( don't

It is produces by
S ⇒ TERM ⇒ TERM.TERM ⇒ (TERM)TERM ⇒ ((TERM))TERM ⇒ (())TERM ⇒ (())(TERM) ⇒ (())()

@freundTech oh wait I mixed up this toot and the one about matching parens

Sign in to participate in the conversation
Mastodon

Generalistic and moderated instance. All opinions are welcome, but hate speeches are prohibited. Users who don't respect rules will be silenced or suspended, depending on the violation severity.