If you've read the NablaVM post archive page you'll know that my NablaVM project's high level language 'Del' ran into a snag given some of my development decisions. Well, I've decided to uproot the Del language compiler and start from scratch. Armed with the knowledge of where my implementation broke down I think I have a good idea of what I need to do to be successful this time around.

Why so difficult?

The reason why this is a bit difficult is for a few reasons. Typically when you think "bytecode executable language", you might be inclined to think about the Monkey Language implemented in GoLang, or maybe you think about Python and the fact that it uses bytecode. While in a way, what I'm doing is related, it is significantly lower level usage of bytecode. Rather than using an instruction to represent actions of a something high level like the "LIST_APPEND" instruction in python, the bytecode I am working with is more closely related to raw machine instructions. The system I've developed has no idea what a 'list' is. Since this is much lower level, translating something like :

func main() -> nil {

    let an_int : int = 4; 

to executable bytecode requires all of the considerations of doing this in C++ when compiling to a specific hardware. Register usage, virtual memory management, and all of the things that are 'abstracted away' by the use of C in Python, or by GoLang in the Monkey programming language.

The Old Approach

When designing Del at first I did a "depth-first" approach to the implementation of the language. I would implement the grammar for a particular subset of the language, then the semantic analysis for that part, then the code generation, then the assembly. This meant that as I moved forward adding parts to the language, if I needed to update a particular analysis to incorporate the new functionality, I would then need to update every layer / stage implemented in every piece of the language that it touched from front to back. Since this was my very first time implementing a compiler, this happened a few times and it began to get quite tedious.

With the issues arising from the depth-first approach and the issues in my design decisions discussed here,  I knew I wouldn't be happy with the end result, so I thought it best to try a new approach.

The New Approach

Breadth first! For the new approach I intend on tackling the compiler in a breadth-first approach whereby I will define a set of grammar for the first implementation of the language. Once I am satisfied, I will develop the semantic analysis stage, then the code generation stage, and so on. A not-so-accurate representation of this might be this image :

Each color column can be seen as a stage in the approach. Each column really could be broken down into more columns like error reporting, memory management, etc, but for the sake of ease I put them into 4 overall categories.

The old approach ended up looking something like a bad neural network if we are using the colors to represent the time that something was done, and each circle/process box was a particular "thing" or "method" in the process pipeline:

It quickly became a mess of logic that is not something I want to be associated with. This happens often to people when implementing something new for the first time. Things that were assumed to not be an issue become an issue, a refactor occurs, and then things are stapled together, and on and on. If this were a funded multi-person project I suspect this wouldn't have happened.... well.. hopefully.

Now that I know more about what I'm doing I've created a branch on the compiler repo and have begun implementing the grammar I expect to have for Del 1.0. I suspect by the end of this weekend I will have all of the grammar fleshed out and ASTs ready for semantic analysis.