Creating a VM for fun - Part 1: ASM

To keep it short, I saw “How to write a virtual machine in order to hide your viruses and break your brain forever” by @s01den, published in tmp.out’s second edition. This paper made me enjoy low-level programming (once again). I wanted to know more about the rather abstract subject of “virtual machines” in reverse engineering, so I read it and started implementing my own VM in assembly!
You can find my code on my GitHub repo.
Why assembly?
I wanted to understand everything I did during this process, so I needed to stick to the lowest level I could. I will talk about the future of this project at the end of the post.
I was also already familiar with assembly, especially NASM on Linux, and wanted to test my knowledge.
Design
Before starting to type very fast on my keyboard, I needed to put things on paper to get a clear overview of the project.
I had to answer a few questions:
What is an instruction?
It’s like a function, or an alias for some code to execute.
How does the CPU know what to do with an instruction?
The code an instruction stands for is already written for the CPU (it is native code), so the CPU runs it seamlessly.
How can I make custom instructions?
Just implement some functions or code blocks, then map each of them to a custom “OPcode”.
How can I make the CPU execute my custom instructions?
Add a simple comparison on the custom OPcode, and execute the code mapped to it.
PoC
Registers
To (re)set the registers, the code is very straightforward and doesn’t really need explanation, right?
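As a rough illustration of what (re)setting registers means, here is a minimal C model of the idea. It is not the NASM code from the repo: the `vm_regs` struct and the `vm_reset` and `vm_set` helpers are hypothetical names, and the three fields simply mirror the registers this post uses (`rax` for the opcode, `rbx` and `rcx` for the operands).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register file: field names mirror the x86-64 registers
   mentioned in the post, but these are plain variables, not real registers. */
typedef struct {
    uint64_t rax; /* opcode register */
    uint64_t rbx; /* operand "a" */
    uint64_t rcx; /* operand "b" */
} vm_regs;

/* Zero every register, roughly what `xor reg, reg` does for each one. */
void vm_reset(vm_regs *r) {
    assert(r != NULL);
    memset(r, 0, sizeof *r);
}

/* Load an opcode and its two operands in one step. */
void vm_set(vm_regs *r, uint64_t opcode, uint64_t a, uint64_t b) {
    assert(r != NULL);
    r->rax = opcode;
    r->rbx = a;
    r->rcx = b;
}
```

Calling `vm_set(&r, 0x3, 40, 2)` would prepare an `add` with operands 40 and 2, and `vm_reset(&r)` puts everything back to zero before the next instruction.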
Instructions
I decided to implement a very small number of instructions, as I already plan to upgrade this project in the future. I only need a proof of concept before going big.
| OPcode | Instruction | NASM |
|---|---|---|
| 0x1 | mov a, b | mov rbx, rcx |
| 0x2 | push a | push rbx |
| 0x3 | add a, b | add rbx, rcx |
| 0x4 | jmp a | jmp rbx |
Yes, some very basic instructions.
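To make the table a bit more concrete, here is a small C sketch of what each of the four instructions does to the VM state. This is only a model under stated assumptions, not the NASM implementation: `vm_state`, the `op_*` functions, the fixed-size stack and the `pc` field (used as a stand-in target for `jmp`) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VM_STACK_SLOTS 16 /* arbitrary size chosen for this sketch */

/* Hypothetical VM state: rbx and rcx play the roles of "a" and "b". */
typedef struct {
    uint64_t rbx;
    uint64_t rcx;
    uint64_t pc;                    /* stand-in for the jump target */
    uint64_t stack[VM_STACK_SLOTS]; /* grows upward in this model */
    size_t   sp;                    /* number of slots currently in use */
} vm_state;

/* 0x1: mov a, b  ->  a = b */
void op_mov(vm_state *vm) { vm->rbx = vm->rcx; }

/* 0x2: push a  ->  store a on the stack */
void op_push(vm_state *vm) {
    assert(vm->sp < VM_STACK_SLOTS); /* refuse to overflow the model stack */
    vm->stack[vm->sp++] = vm->rbx;
}

/* 0x3: add a, b  ->  a = a + b */
void op_add(vm_state *vm) { vm->rbx += vm->rcx; }

/* 0x4: jmp a  ->  continue execution at a (modelled by setting pc) */
void op_jmp(vm_state *vm) { vm->pc = vm->rbx; }
```

With `rbx = 40` and `rcx = 2`, calling `op_add` leaves 42 in `rbx`, and a following `op_push` stores that 42 in the first stack slot.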
Execution
The concept here is to compare rax, our opcode register, against each known opcode and then call the corresponding function.
Future
In a future post, I will cover how to improve this VM using C, especially by adding fully emulated virtual memory.