I rewrote my toy language interpreter in Rust
I've been rewriting Grotsky (my toy programming language) in Rust; the previous implementation was written in Go. The goal of the rewrite is to improve my Rust skills and to improve Grotsky's performance by at least 10x. This is the latest post in a series, and hopefully the best and most insightful of them all.
In previous posts:
I've outlined a plan to migrate Grotsky to a Rust-based platform. Originally, my plan was very ambitious and I thought I would be able to finish the transition in about two months. In reality it took five months :-)
Performance improvement
I was aiming for a 10x improvement. In reality it's not that much: I ran various benchmarks and got at most a 4x improvement. That's not great, but also not that bad given that I know very little about writing high-performance Rust. The interpreter is written in the dumbest and easiest way I could manage.
Let's look at some numbers for different programs.
Loop program
let start = io.clock()

fn test() {
    let a = 1
    while a < 10000000 {
        a = a + 1
    }
}

test()

io.println(io.clock() - start)
The result is:
trial #1
build/grotsky     best 5.135s  3.877x time of best
build/grotsky-rs  best 1.325s  287.6800% faster
trial #2
build/grotsky     best 5.052s  3.814x time of best
build/grotsky-rs  best 1.325s  281.4182% faster
trial #3
build/grotsky     best 5.035s  3.802x time of best
build/grotsky-rs  best 1.325s  280.1663% faster
trial #4
build/grotsky     best 5.003s  3.777x time of best
build/grotsky-rs  best 1.325s  277.6831% faster
trial #5
build/grotsky     best 5.003s  3.777x time of best
build/grotsky-rs  best 1.325s  277.6831% faster
Recursive fibonacci
let start = io.clock()

fn fib(n) {
    if n <= 2 {
        return n
    }
    return fib(n-2) + fib(n-1)
}

fib(28)

io.println(io.clock() - start)
The result is:
trial #1
build/grotsky     best 0.8409s  294.5155% faster
build/grotsky-rs  best 3.317s   3.945x time of best
trial #2
build/grotsky     best 0.8168s  271.3829% faster
build/grotsky-rs  best 3.033s   3.714x time of best
trial #3
build/grotsky     best 0.797s   245.6835% faster
build/grotsky-rs  best 2.755s   3.457x time of best
trial #4
build/grotsky     best 0.7784s  249.9964% faster
build/grotsky-rs  best 2.724s   3.5x time of best
trial #5
build/grotsky     best 0.7784s  249.9964% faster
build/grotsky-rs  best 2.724s   3.5x time of best
In this case the Rust version is about 3.5x slower. This is due to function calls: I'm not very well versed in Rust, so on each call I'm copying a lot of data over and over. In the Go implementation everything is just pointers, so there's much less copying.
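To give a rough idea of the difference (this is a made-up sketch, not the actual Grotsky code; BigValue, call_by_clone and call_by_rc are hypothetical names), compare cloning arguments into a fresh frame on every call with sharing them behind a reference-counted pointer:

use std::rc::Rc;

#[derive(Clone)]
struct BigValue {
    data: Vec<f64>, // stand-in for a list/object payload
}

// Copy-heavy style: every call clones its arguments into the new frame.
fn call_by_clone(args: &[BigValue]) -> Vec<BigValue> {
    args.to_vec() // deep copy of every argument, on every call
}

// Pointer style (closer to the Go version): only cheap handles are copied.
fn call_by_rc(args: &[Rc<BigValue>]) -> Vec<Rc<BigValue>> {
    args.iter().map(Rc::clone).collect() // bumps a refcount, payload stays put
}

fn main() {
    let v = BigValue { data: vec![0.0; 1_000_000] };
    let _frame1 = call_by_clone(&[v.clone()]); // copies a million floats
    let _frame2 = call_by_rc(&[Rc::new(v)]);   // copies a pointer and a counter
}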
Compile to bytecode
The Rust implementation adds support for compiling to bytecode. It's now possible to generate a bytecode file and run it later. This is a way of distributing programs without giving away the source code, and it's also a little more performant because the parsing and compilation phases are skipped.
How it works:
grotsky compile example.gr  # Compile file
grotsky example.grc         # Run compiled file
Memory model
Grotsky is a reference-counted language. We're using Rust's Rc and RefCell to keep track of values.
pub struct MutValue<T>(pub Rc<RefCell<T>>);

impl<T> MutValue<T> {
    pub fn new(obj: T) -> Self {
        MutValue::<T>(Rc::new(RefCell::new(obj)))
    }
}

pub enum Value {
    Class(MutValue<ClassValue>),
    Object(MutValue<ObjectValue>),
    Dict(MutValue<DictValue>),
    List(MutValue<ListValue>),
    Fn(MutValue<FnValue>),
    Native(NativeValue),
    Number(NumberValue),
    String(StringValue),
    Bytes(BytesValue),
    Bool(BoolValue),
    Slice(SliceValue),
    Nil,
}

pub enum Record {
    Val(Value),
    Ref(MutValue<Value>),
}

pub struct VM {
    pub activation_records: Vec<Record>,
}
Most of the simple values are just stored as-is: Native (builtin functions), Number, String, Bytes, Bool, Slice and Nil.
For the more complex values we need to use 'pointers', which in this case are MutValue.
The Grotsky VM then uses Records, which can be either a plain Value or a reference to a Value. The records are registers; each function has up to 255 of them. References to values are used to store upvalues: a register is turned into an upvalue when a variable is closed over by another function, as the sketch below shows.
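As a rough sketch of how that promotion might look (using simplified stand-ins for the real Value and Record types above, not the actual VM code):

use std::cell::RefCell;
use std::rc::Rc;

// Simplified stand-ins for the Value and Record types shown above.
#[derive(Debug, Clone)]
enum Value {
    Number(f64),
    Nil,
}

#[derive(Debug, Clone)]
enum Record {
    Val(Value),              // plain register
    Ref(Rc<RefCell<Value>>), // register promoted to an upvalue
}

fn main() {
    // A function frame with a couple of registers.
    let mut registers = vec![Record::Val(Value::Number(1.0)), Record::Val(Value::Nil)];

    // Register 0 gets closed over, so it is promoted: the value moves behind
    // an Rc<RefCell<...>> that the frame and the closure can both share.
    if let Record::Val(v) = std::mem::replace(&mut registers[0], Record::Val(Value::Nil)) {
        let upvalue = Rc::new(RefCell::new(v));
        registers[0] = Record::Ref(Rc::clone(&upvalue));
        // The closure keeps its own Rc::clone(&upvalue); writes through either
        // handle are visible through the other.
        *upvalue.borrow_mut() = Value::Number(2.0);
    }

    println!("{:?}", registers[0]); // Ref(RefCell { value: Number(2.0) })
}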
This implementation ends up being quite slow, but it's easy to manage, because the Rust standard library does all the work.
Using Rust in this blogpost
As you may know, this blog is powered by grotsky. I'm happy to say that I successfully migrated from grotsky to grotsky-rs as the backend for the blog, and what you're reading now is generated by the latest implementation of the language, written in Rust.
Even for local development the Rust version is used, which means I'm using a TCP server and an HTTP implementation written in Grotsky.
Closing remarks
This has been a great learning experience, and I'm happy to have finished because it required a lot of effort. I'm not going to announce any new work on this interpreter, but I would like to keep adding stuff, improving it further to make it more performant and more usable.
In the end, I encourage everyone to try it and also to start their own project. It's always cool to see what everyone else is doing.
Thanks for reading.