In the Intel SDM (Software Developer's Manual), Vol. 2, the "pop" instruction is defined by the following pseudocode:
IF StackAddrSize = 32
    THEN
        IF OperandSize = 32
            THEN
                DEST ← SS:ESP; (* Copy a doubleword *)
                ESP ← ESP + 4;
            ELSE (* OperandSize = 16 *)
                DEST ← SS:ESP; (* Copy a word *)
                ESP ← ESP + 2;
        FI;
    ELSE IF StackAddrSize = 64
        THEN
            IF OperandSize = 64
                THEN
                    DEST ← SS:RSP; (* Copy quadword *)
                    RSP ← RSP + 8;
                ELSE (* OperandSize = 16 *)
                    DEST ← SS:RSP; (* Copy a word *)
                    RSP ← RSP + 2;
            FI;
FI;
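To make the pseudocode a bit more concrete, here is a minimal Python sketch of just the StackAddrSize = 64 branch. It is a toy model, not an emulator: the function name pop_value, the base address 0x1000 and the byte string standing in for stack memory are all made up for illustration.

```python
def pop_value(stack: bytes, base: int, rsp: int, operand_size: int):
    """Toy model of the StackAddrSize = 64 branch: return (DEST, new RSP).

    `stack` holds the bytes of stack memory starting at address `base`
    (both are hypothetical, chosen only for this illustration).
    """
    if operand_size not in (64, 16):
        raise ValueError("this branch only has 64-bit and 16-bit operands")
    nbytes = operand_size // 8
    offset = rsp - base
    # DEST <- SS:RSP (x86 is little-endian)
    value = int.from_bytes(stack[offset:offset + nbytes], "little")
    # RSP <- RSP + 8 (or + 2 for a 16-bit operand)
    return value, rsp + nbytes


stack = (0x1122334455667788).to_bytes(8, "little") + (2).to_bytes(8, "little")
quad, rsp_after_quad = pop_value(stack, 0x1000, 0x1000, 64)
word, rsp_after_word = pop_value(stack, 0x1000, 0x1000, 16)
print(hex(quad), hex(rsp_after_quad))
print(hex(word), hex(rsp_after_word))
```

The only difference between the two operand sizes is how many bytes are read and how far the stack pointer moves.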
Let's say the top of the stack is at address 0x7fffffffe6a0 and the values 1, 2, 3 and 4 are already on it.
So the layout looks like this:
0x7fffffffe6a0: 1
0x7fffffffe6a8: 2
0x7fffffffe6b0: 3
0x7fffffffe6b8: 4
And some registers:
RSP = 0x7fffffffe6a0
RDI = ?
RSI = ?
Now if we execute "pop rdi", we write what's on top of the stack (in this case 1) to RDI and increment RSP by 8. Nothing new; this is all standard stuff and it is exactly what the SDM tells us.
So now we have:
0x7fffffffe6a8: 2
0x7fffffffe6b0: 3
0x7fffffffe6b8: 4
And registers:
RSP = 0x7fffffffe6a8
RDI = 1
RSI = ?
After executing "pop rsi", we enter this state:
0x7fffffffe6b0: 3
0x7fffffffe6b8: 4
And registers:
RSP = 0x7fffffffe6b0
RDI = 1
RSI = 2
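The two pops so far can be replayed with a small Python sketch. This is only a toy model of the walk-through, with a dictionary standing in for memory and another for the registers; the pop helper is a made-up name and handles just the 64-bit operand case for a destination other than rsp.

```python
# Toy model of the stack from the walk-through; not a real CPU emulator.
memory = {
    0x7fffffffe6a0: 1,
    0x7fffffffe6a8: 2,
    0x7fffffffe6b0: 3,
    0x7fffffffe6b8: 4,
}
regs = {"rsp": 0x7fffffffe6a0, "rdi": None, "rsi": None}


def pop(dest):
    """Model `pop dest` for a 64-bit register other than rsp."""
    regs[dest] = memory[regs["rsp"]]  # DEST <- SS:RSP
    regs["rsp"] += 8                  # RSP <- RSP + 8


pop("rdi")
pop("rsi")
print(hex(regs["rsp"]), regs["rdi"], regs["rsi"])  # 0x7fffffffe6b0 1 2
```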
Now, to make things a little more interesting, let's execute pop with the stack pointer itself as the destination, that is, "pop rsp". What is the state of the stack and registers after we execute that? Following the pseudocode, we first write the value from the top of the stack to the destination, in this case RSP, so RSP = 3, and then we increment RSP by 8, so RSP = 11.
Right?
Wrong.
And this is the surprising behavior from the title of this blog post. The SDM has an additional note describing the special case of executing "pop esp/rsp":
The POP ESP instruction increments the stack pointer (ESP) before data at the old top of stack is written into the destination.
So first we increment RSP, and only then do we write to the destination. But the destination is RSP itself, so we overwrite the value we just incremented. This means "pop rsp" essentially behaves like "mov rsp, [rsp]".
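The difference between the two orderings is easy to see in a small Python sketch. Again, this is just a toy model under the assumptions of the example above (a dictionary as memory, RSP = 0x7fffffffe6b0 before the pop); the function names pop_rsp_naive and pop_rsp_actual are made up for illustration.

```python
# Stack state right before "pop rsp" in the walk-through.
memory = {0x7fffffffe6b0: 3, 0x7fffffffe6b8: 4}
rsp_before = 0x7fffffffe6b0


def pop_rsp_naive(memory, rsp):
    """Order suggested by reading the pseudocode literally: write, then increment."""
    rsp = memory[rsp]  # DEST <- SS:RSP, with DEST being RSP
    rsp += 8           # RSP <- RSP + 8
    return rsp


def pop_rsp_actual(memory, rsp):
    """Order described by the SDM note: increment, then write."""
    old_top = rsp
    rsp += 8               # increment the stack pointer first...
    rsp = memory[old_top]  # ...then overwrite it with the old top of stack
    return rsp


print(pop_rsp_naive(memory, rsp_before))   # 11
print(pop_rsp_actual(memory, rsp_before))  # 3
```

In the second function the increment has no lasting effect, which is why the result matches a plain load from the old top of the stack.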