Recently a coworker showed me a book of assembly poems. At first I thought “well, this joke is very funny”, and then I read the first page. Five minutes later I ordered my own copy, and now that it has arrived, I want to write my notes here.
I will try to understand each of the 64 pages of this awesome work, both to maintain my assembly knowledge and to learn new tricks.
Of course, I may be wrong in my interpretation of some code or instructions. If so, feel free to help me improve!
    xor eax,eax
    lea rbx,
    loop $
    mov rdx,0
    and esi,0
    sub edi,edi
    push 0
    pop rbp
The first page is pretty straightforward. As you can see, every line sets one register to 0.
Every instruction here is very common and easy to understand, but one thing caught my attention: loop $. So let’s dig a bit here.
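loop <label> can be summarized like so (in 64-bit code the counter is rcx; cx is its 16-bit form):
    ; loop label
    dec rcx        ; decrement the counter (the real instruction leaves the flags untouched)
    jnz label      ; jump back to label while the counter is not 0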
As you can see, it decrements the counter (cx, or rcx in 64-bit code) at each iteration until it is 0. So a loop $ would simply set the counter to 0, as the label to loop to is $, the symbol for the current address.
Usually with loop label you would have some code between label and the loop instruction, but here there is none, so the loop only decrements the counter.
Pretty sneaky trick.
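To make the behaviour concrete, here is a minimal Python sketch of loop $. The name loop_self and the 64-bit mask are my own assumptions for illustration; the model only tracks the counter and how many times the instruction runs.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit mode, so the counter is rcx


def loop_self(rcx):
    """Model `loop $`: decrement rcx, jump back to the same instruction until rcx is 0."""
    iterations = 0
    while True:
        rcx = (rcx - 1) & MASK64  # the decrement wraps around like the real register
        iterations += 1
        if rcx == 0:
            break
    return rcx, iterations


print(loop_self(5))  # (0, 5)
```

Note that starting with rcx already at 0 would wrap around and spin 2^64 times before reaching 0 again, so do not try loop_self(0) at home.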
    .loop:
        xadd rax,rdx
        loop .loop
This one is a little tricky and pretty obfuscated, but it was very quick to understand. In fact, this loop walks through the Fibonacci sequence, one term per iteration, for rcx iterations (assuming rax and rdx start from suitable values such as 0 and 1).
As we learnt about loop at 0x00, it loops until rcx is 0, and here there is one instruction which gets executed within the loop.
xadd is quite uncommon. For xadd rax, rdx, this instruction can be coded as:
    xadd:
        ; swap rax and rdx
        xchg rax, rdx
        ; rax = rax + rdx (the sum lands in the destination operand)
        add rax, rdx
An experienced programmer can spot the algorithm that computes the Fibonacci sequence here: each new element is the sum of the two previous elements of the sequence.
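The loop can be simulated in a few lines of Python. This is only a sketch: the function name xadd_loop and the starting values (rax = 0, rdx = 1) are my assumptions, since the page does not say how the registers are initialised, and rcx is assumed to be at least 1.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def xadd_loop(rax, rdx, rcx):
    """Simulate `.loop: xadd rax,rdx / loop .loop` and record rax after each pass."""
    trace = []
    for _ in range(rcx):  # assumes rcx >= 1
        # xadd rax, rdx: rax receives the sum, rdx receives the old rax
        rax, rdx = (rax + rdx) & MASK64, rax
        trace.append(rax)
    return trace


print(xadd_loop(0, 1, 10))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```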
    neg rax
    sbb rax,rax
    neg rax
This code was a bit tricky. After discussing it with friends (xThaz & SoEaSy), we agreed that this code tells whether rax is different from 0.
At first I said that it tells the sign of rax, but I was wrong: I misunderstood the sbb instruction and overlooked what sbb rax, rax computes.
To understand this code, look at neg first: besides switching the sign of its operand, it also sets the carry flag CF, to 1 if the operand was not 0 and to 0 otherwise.
We can now dig into sbb. Its code can be:
    sbb:
        ; sbb rbx, rax: rbx is the destination, rax is the source
        mov r8, rax
        add r8, CF      ; pseudo-instruction: CF is 0 or 1
        sub rbx, r8
By looking at the possible output values, we can see that it subtracts either rax + 0 or rax + 1 from rbx. Given that in this code we have sbb rax, rax, we can see that we will only get rax = rax - (rax + CF), with CF being 0 or 1. Thus the output of the instruction is either 0 or -1, depending on whether rax was initially 0 or not. The final neg rax turns -1 into 1, so rax ends up as 1 if it was non-zero and 0 otherwise.
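The three instructions can be modelled in Python to check this reading. It is a sketch under my own assumptions: the helper names neg, sbb and is_nonzero are made up, and only the carry flag is tracked.

```python
MASK64 = (1 << 64) - 1


def neg(x):
    """Return (-x mod 2^64, CF); neg sets CF to 1 unless the operand was 0."""
    return (-x) & MASK64, int(x != 0)


def sbb(dst, src, cf):
    """Return dst - (src + CF) mod 2^64."""
    return (dst - src - cf) & MASK64


def is_nonzero(rax):
    rax, cf = neg(rax)       # neg rax     -> CF = (rax != 0)
    rax = sbb(rax, rax, cf)  # sbb rax,rax -> 0 or 0xffffffffffffffff (-1)
    rax, _ = neg(rax)        # neg rax     -> 0 or 1
    return rax


for value in (0, 1, 42, MASK64):
    print(hex(value), "->", is_nonzero(value))
```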
    sub rdx,rax
    sbb rcx,rcx
    and rcx,rdx
    add rax,rcx
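With the previous pages we now know all of the above instructions, so we can deduce the behaviour of the code, which can be written in a more high-level language like Python:
    def my_func(rax, rdx):
        # unsigned comparison: the smaller of the two values ends up in rax
        if rax > rdx:
            rax += rdx - rax  # rdx holds the difference after sub rdx, rax
        return rax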
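Yes, this is that simple. The tricky part comes from the way the condition is built without any jump:
- sub rdx, rax stores the difference in rdx, and sets the CF if rax was greater than rdx
- sbb rcx, rcx sets rcx to 0 if CF is 0, and to -1 (all bits set) if CF is 1
- and rcx, rdx
  - if rax was lower than rdx at line 1: rcx is 0 and then the and will still set rcx to 0
  - if rax was greater than rdx at line 1: rcx becomes the value of rdx (which now holds the difference)
- add rax, rcx is trivial now: the difference is added to rax only if rax > rdx, which leaves the smaller of the two original values in rax.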
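Following the four instructions step by step in Python shows what ends up in rax. This is a sketch with a made-up function name (branchless_min); it tracks only the carry flag and treats the registers as unsigned 64-bit values.

```python
MASK64 = (1 << 64) - 1


def branchless_min(rax, rdx):
    """Step through sub / sbb / and / add on unsigned 64-bit values."""
    cf = int(rdx < rax)              # sub rdx,rax borrows when rdx < rax
    rdx = (rdx - rax) & MASK64       # sub rdx,rax
    rcx = (0 - cf) & MASK64          # sbb rcx,rcx -> 0 or all ones
    rcx &= rdx                       # and rcx,rdx -> 0 or the difference
    rax = (rax + rcx) & MASK64       # add rax,rcx
    return rax


print(branchless_min(3, 7), branchless_min(7, 3), branchless_min(5, 5))  # 3 3 5
```

In other words, the page computes the unsigned minimum of rax and rdx without a single jump.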
This code was pretty simple, but I was unsure of its exact behaviour, so I tested it:
    for x in range(256):  # every possible byte value
        print(chr(x), chr(x ^ 0x20))
And as you can see by executing this code, the ASCII letters have their case swapped: lowercase letters become uppercase, and vice versa.
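The same trick can be wrapped in a small helper. This is a sketch with a hypothetical name (toggle_case); it restricts the xor to ASCII letters, because for other characters flipping bit 0x20 does not mean anything case-related.

```python
import string


def toggle_case(text):
    """Flip bit 0x20 of every ASCII letter, leaving other characters alone."""
    return "".join(
        chr(ord(c) ^ 0x20) if c in string.ascii_letters else c
        for c in text
    )


print(toggle_case("Hello, World!"))  # hELLO, wORLD!
```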
    sub rax,5
    cmp rax,4
This code is so tiny that it seems complex, but fear not, I think I have the solution for it. This code shows the similarities and differences between sub and cmp:
- sub rax, 5 actually overwrites the value of rax
- cmp rax, 4 computes sub rax, 4 but only stores the result in the flags; rax is not modified
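There is possibly a second reading of this page, which I have not verified against the book: if an unsigned conditional jump such as jbe followed the cmp (the page itself only has the two instructions, so this is an assumption), the pair would act as a single range check. A minimal sketch, with a made-up function name:

```python
MASK64 = (1 << 64) - 1


def sub5_cmp4_is_below_or_equal(rax):
    """Return True when a hypothetical `jbe` after `sub rax,5 / cmp rax,4` would jump."""
    rax = (rax - 5) & MASK64  # sub rax,5 wraps around for values below 5
    cf = rax < 4              # cmp rax,4 sets CF on unsigned borrow
    zf = rax == 4             # cmp rax,4 sets ZF on equality
    return cf or zf           # jbe jumps when CF or ZF is set


print([x for x in range(12) if sub5_cmp4_is_below_or_equal(x)])  # [5, 6, 7, 8, 9]
```

Values below 5 wrap around to huge unsigned numbers after the sub, so only 5 to 9 pass the comparison.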
    not rax
    inc rax
    neg rax
This code teaches a trick with those instructions (values shown on 16 bits):
- not rax computes the one’s complement of the value (it inverts every bit) -> not 0x5 = 0xfffa
- inc rax; neg rax does the same -> neg 0x6 = 0xfffa
    inc rax
    neg rax
    inc rax
    neg rax
This code shows us that inc; neg is symmetric: doing it twice leads us back to the original value. This is pretty logical, as inc; neg is the same as not.
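Both observations (inc; neg behaves like not, and applying it twice gives back the original value) can be checked with a quick Python sketch. The helper names are mine, and the trailing underscore in not_ is only there because not is a Python keyword.

```python
MASK64 = (1 << 64) - 1


def not_(x):
    return ~x & MASK64       # invert every bit


def inc(x):
    return (x + 1) & MASK64  # add 1, wrapping at 64 bits


def neg(x):
    return -x & MASK64       # two's complement negation


def inc_neg(x):
    return neg(inc(x))


samples = [0, 1, 5, 0x1234, MASK64 - 1, MASK64]
print(all(inc_neg(x) == not_(x) for x in samples))      # inc; neg == not
print(all(inc_neg(inc_neg(x)) == x for x in samples))   # doing it twice is a no-op
print(all(neg(inc(not_(x))) == x for x in samples))     # not; inc; neg is a no-op too
```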
    add rax,rdx
    rcr rax,1
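rcr is not very common: it rotates the bits to the right through the carry flag.
In fact, CF takes part in the rotation (the old CF enters the top bit, and the bit that falls off becomes the new CF), but I will skip the details as they are too long.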
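For the curious, here is a short Python sketch of what the two instructions do together. The function name add_rcr is an assumption of mine, and only CF is modelled. Under this model, the carry produced by the add is rotated back in as bit 63, so the result looks like the average of the two values computed without losing the overflow.

```python
MASK64 = (1 << 64) - 1


def add_rcr(rax, rdx):
    """Model `add rax,rdx` followed by `rcr rax,1`; returns (rax, CF)."""
    total = rax + rdx
    cf = total >> 64            # carry out of the 64-bit addition
    rax = total & MASK64
    new_cf = rax & 1            # bit 0 falls off into CF
    rax = (rax >> 1) | (cf << 63)  # old CF enters the top bit
    return rax, new_cf


print(add_rcr(4, 6))  # (5, 0)
print(add_rcr(3, 4))  # (3, 1)
print(add_rcr(MASK64, MASK64)[0] == MASK64)  # True
```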
    shr rax,3
    adc rax,0
shr is a right shift of bits, which stores the last bit shifted out in CF. shr rax, n does the same as dividing by 2^n.
adc is an add with the CF in addition:
    ; adc rax, rbx: rax is the destination, rbx is the source
    adc:
        add rax, rbx
        add rax, CF     ; pseudo-instruction: CF is 0 or 1
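So here the code divides rax by 8 (2^3), and if the 3rd bit (the last one shifted out) was 1, it increases rax by 1. We can rewrite this code in Python to make things clearer:
    def nine(a):
        cf = (a >> 2) & 1   # 3rd bit: the last one shifted out by shr rax, 3
        result = a >> 3     # shr rax, 3: divide by 8
        result += cf        # adc rax, 0: add the carry
        return result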
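Adding the carry back amounts to rounding to the nearest integer instead of rounding down. The sketch below compares the two formulations on small inputs; the function names are mine and the comparison is only a sanity check, not a proof.

```python
def shr3_adc(a):
    """shr rax,3 / adc rax,0: divide by 8, then add the last bit shifted out."""
    cf = (a >> 2) & 1
    return (a >> 3) + cf


def round_div8(a):
    """Division by 8 rounded to the nearest integer (halves round up)."""
    return (a + 4) // 8


print(all(shr3_adc(a) == round_div8(a) for a in range(1024)))  # True
print(shr3_adc(11), shr3_adc(12))  # 1 2
```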
        add byte [rdi],1
    .loop:
        inc rdi
        adc byte [rdi],0
        loop .loop
Now it gets a bit interesting.
- add byte [rdi], 1 increases the byte pointed to by rdi, and sets CF if that byte overflows
- inc rdi adds 1 to rdi without modifying the CF
- adc byte [rdi], 0 adds the content of CF to the byte pointed to by rdi
- loop .loop -> dec rcx; jnz .loop, without touching the flags either
My guess here is that it increments a multi-byte number stored in memory at the address pointed to by rdi, least significant byte first: the first byte gets + 1, and each following byte absorbs the carry of the previous one through CF. The size of the number is rcx + 1 bytes.
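Here is a small Python model of the loop to test that guess. It is a sketch under assumptions of mine: the buffer is a bytearray standing in for the memory at rdi, the function name increment_bytes is made up, and rcx is assumed to be at least 1 and small enough to stay inside the buffer.

```python
def increment_bytes(buf, rcx):
    """Model the page: +1 on the first byte, then propagate the carry over rcx more bytes."""
    i = 0
    total = buf[i] + 1            # add byte [rdi],1
    buf[i] = total & 0xFF
    cf = total >> 8               # carry out of the byte
    for _ in range(rcx):          # loop .loop runs rcx times
        i += 1                    # inc rdi (CF untouched)
        total = buf[i] + cf       # adc byte [rdi],0
        buf[i] = total & 0xFF
        cf = total >> 8
    return buf


print(increment_bytes(bytearray([0xFF, 0xFF, 0x00]), 2).hex())  # 000001
```

Read as a little-endian number, 0x00ffff became 0x010000, which is indeed + 1.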
    not rdx
    neg rax
    sbb rdx,-1
Here I got stuck for at least one hour, and because we had already seen all the instructions, I did not bother re-reading the documentation. It is pointless to read something twice, right?
So after playing the “coworker help” joker, we saw that neg not only computes the two’s complement of the register, but also sets CF to 1 if the register is not 0. Remember to always double-check what you read, especially documentation.
So this code inverts all bits of rdx, and sbb rdx, -1 then computes rdx + 1 - CF: if rax is not 0, rdx stays inverted; if rax is 0, rdx is incremented. In other words, it negates the 128-bit value stored in rdx:rax.
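The 128-bit reading can be checked against Python's big integers. This is a sketch with a hypothetical function name (neg128); rdx is taken as the high half and rax as the low half.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def neg128(rdx, rax):
    """Model `not rdx / neg rax / sbb rdx,-1`; returns the new (rdx, rax)."""
    rdx = ~rdx & MASK64              # not rdx
    cf = int(rax != 0)               # neg rax sets CF unless rax was 0
    rax = -rax & MASK64              # neg rax
    rdx = (rdx + 1 - cf) & MASK64    # sbb rdx,-1 -> rdx - (-1) - CF
    return rdx, rax


for hi, lo in [(0, 0), (0, 1), (1, 0), (0x1234, 0x5678)]:
    new_hi, new_lo = neg128(hi, lo)
    expected = -((hi << 64) | lo) & MASK128
    print(((new_hi << 64) | new_lo) == expected)  # True for every pair
```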
    mov rcx,rax
    xor rcx,rbx
    ror rcx,0xd

    ror rax,0xd
    ror rbx,0xd
    xor rax,rbx

    cmp rax,rcx
Despite the number of lines, this code was very easy to understand. In fact, I was even able to guess its behaviour (CTF player strength here) within a few seconds.
The code acts in two steps, and demonstrates that a ror of a xor is the same as a xor of two rors.
The first 3 lines copy rax into rcx, do a xor rcx, rbx, and then a ror rcx, 0xd. The next 3 lines do the opposite: ror rax, 0xd (and also ror rbx, 0xd) and then xor rax, rbx.
Finally we see that both rax and rcx have the same value, proving that ror and xor can be applied in either order: ror(a xor b) = ror(a) xor ror(b).
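A quick Python check of that identity, as a sketch: the ror helper is my own implementation of a 64-bit rotate right, and the sample values are arbitrary.

```python
MASK64 = (1 << 64) - 1


def ror(x, n, bits=64):
    """Rotate x right by n positions on a `bits`-wide register."""
    n %= bits
    return ((x >> n) | (x << (bits - n))) & ((1 << bits) - 1)


a = 0x0123456789ABCDEF
b = 0xFEDCBA9876543210

rcx = ror(a ^ b, 0xD)            # first block: xor, then ror
rax = ror(a, 0xD) ^ ror(b, 0xD)  # second block: ror, then xor
print(rax == rcx)  # True
```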
    mov rdx,rbx

    xor rbx,rcx
    and rbx,rax

    and rdx,rax
    and rax,rcx
    xor rax,rdx

    cmp rax,rbx
This does the same as 0x0c, but for xor; and & and; xor: (rbx xor rcx) and rax gives the same result as (rax and rbx) xor (rax and rcx). Nothing really outstanding.
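The same kind of sanity check works here. In this sketch the function names and the exhaustive 4-bit test are my own choice; since and and xor work bit by bit, small widths already exercise every case.

```python
def xor_then_and(a, b, c):
    return (b ^ c) & a          # xor rbx,rcx / and rbx,rax


def and_then_xor(a, b, c):
    return (a & c) ^ (a & b)    # and rdx,rax / and rax,rcx / xor rax,rdx


values = range(16)  # every 4-bit pattern
print(all(
    xor_then_and(a, b, c) == and_then_xor(a, b, c)
    for a in values for b in values for c in values
))  # True
```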
    mov rcx,rax
    and rcx,rbx
    not rcx

    not rax
    not rbx
    or rax,rbx

    cmp rax,rcx
Same goes for and; not & not; or: not (rax and rbx) is the same as (not rax) or (not rbx), which is De Morgan's law.
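And once more in Python, as a sketch with made-up function names, using a 64-bit mask so that Python's unbounded integers behave like registers:

```python
MASK64 = (1 << 64) - 1


def and_then_not(a, b):
    return ~(a & b) & MASK64                   # and rcx,rbx / not rcx


def not_then_or(a, b):
    return (~a & MASK64) | (~b & MASK64)       # not rax / not rbx / or rax,rbx


samples = [0, 1, 0xFF00, 0x0123456789ABCDEF, MASK64]
print(all(and_then_not(a, b) == not_then_or(a, b) for a in samples for b in samples))  # True
```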
    .loop:
        xor byte [rsi],al
        lodsb
        loop .loop