When I opened a malware sample in Ghidra/IDA after a while away from assembly, I felt like I had lost touch with the basics.
So I’m rebuilding from the bottom: 8086-era 16-bit regs, then the 32-bit IA-32 expansions, and finally x86-64. What I want at the end is simple: when I see a mov, a jmp, or a call, I want to know exactly what is happening at the register level rather than settle for a high-level interpretation.
The 8086 mental model: 8 “general” registers
The 8086 gives you eight 16-bit registers that show up everywhere:
AX, BX, CX, DX
SI, DI, BP, SP
A key detail I tend to forget until I trip over it again: AX/BX/CX/DX can be accessed as 8-bit halves (AH/AL, etc.), while SI/DI/BP/SP are 16-bit only in the original 8086 model.
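To pin down what "8-bit halves" means, here is a minimal Python sketch that treats AX as a plain 16-bit integer. The helper names (split16, write_low8, write_high8) are made up for illustration and don't come from any tool or library.

```python
# Toy model of one 16-bit register (AX) and its two 8-bit halves.
# All helper names here are hypothetical, chosen only for this sketch.

def split16(ax):
    """Return (AH, AL) for a 16-bit value."""
    ax &= 0xFFFF
    return (ax >> 8) & 0xFF, ax & 0xFF

def write_low8(ax, value):
    """Model `mov al, value`: replace bits 0-7, leave AH untouched."""
    return (ax & 0xFF00) | (value & 0xFF)

def write_high8(ax, value):
    """Model `mov ah, value`: replace bits 8-15, leave AL untouched."""
    return (ax & 0x00FF) | ((value & 0xFF) << 8)

ax = 0x1234
ah, al = split16(ax)                 # AH = 0x12, AL = 0x34
ax_after_al = write_low8(ax, 0xFF)   # 0x12FF
ax_after_ah = write_high8(ax, 0xAB)  # 0xAB34
print(hex(ah), hex(al), hex(ax_after_al), hex(ax_after_ah))
```

The point is that AH and AL are not separate storage: writing one changes AX, and the other half stays as it was.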
AX: accumulator, commonly used in arithmetic and I/O patterns
BX: historically “base” for addressing (you’ll see it used as a base pointer in some styles)
CX: “count” register; shows up in loops and repeat/string ops
DX: often paired with AX for wider results (mul/div patterns)
SP/BP: stack pointer and base pointer energy (stack frame / locals / args)
SI/DI: “source” / “destination” index vibes, especially around string/memory ops
Modern compilers don't stick to these roles strictly, but I still find them helpful as a starting point.
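The DX-paired-with-AX role is easiest to see with the 16-bit unsigned mul/div forms, where the wide value lives in DX:AX. A rough Python sketch of that behavior follows; mul16 and div16 are hypothetical helper names, and only the unsigned forms are modeled.

```python
# Toy model of 16-bit unsigned `mul` and `div`, which use DX:AX as one
# 32-bit value. Helper names are hypothetical.

def mul16(ax, src):
    """Model `mul src`: DX:AX = AX * src. Returns (DX, AX)."""
    product = (ax & 0xFFFF) * (src & 0xFFFF)
    return (product >> 16) & 0xFFFF, product & 0xFFFF

def div16(dx, ax, src):
    """Model `div src`: AX = DX:AX // src, DX = DX:AX % src. Returns (DX, AX)."""
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    # A zero divisor raises ZeroDivisionError here; a real CPU raises #DE.
    quotient, remainder = divmod(dividend, src & 0xFFFF)
    if quotient > 0xFFFF:
        # A real CPU also raises #DE when the quotient does not fit in AX.
        raise OverflowError("quotient does not fit in AX")
    return remainder, quotient

dx, ax = mul16(0x1234, 0x1000)    # product 0x01234000 -> DX=0x0123, AX=0x4000
rem, quo = div16(dx, ax, 0x1000)  # back to quotient 0x1234, remainder 0
print(hex(dx), hex(ax), hex(quo), rem)
```

In a disassembly, this is why a write to DX (or a cwd/cdq-style sign extension, for the signed forms) right before a div is usually setup for the division and not an unrelated variable.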
16-bit → 32-bit (IA-32)
When the 80386/IA-32 era arrives, the big change is: those same registers become 32-bit, and the naming gets an E prefix:
EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
What clicked for me is that the old 16-bit names don’t disappear — they become the low 16 bits of the 32-bit register, and the 8-bit pieces still exist where they existed before. The classic lists show this explicitly (eax/ecx/… and ax/cx/… and al/ah/etc.).
EAX (32)
└── AX (16, low half of EAX)
    ├── AH (8 high)
    └── AL (8 low)

A Graphical representation
32-bit → 64-bit (x86-64)
In x86-64, everything expands again:
RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, plus the significant upgrade: eight new registers, R8–R15. Each of the sixteen has smaller names for its low 32, 16, and 8 bits.
Example with RAX:
RAX = 64-bit
EAX = low 32-bit
AX = low 16-bit
AL/AH = low/high 8-bit pieces (of AX)
The extra registers (R8–R15) also have R8D/R8W/R8B forms for 32/16/8-bit access. (Stack Overflow)
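One way to keep the naming straight is to think of each smaller name as a mask over the same 64 bits. A small Python sketch of that idea is below; the function name and dictionary keys are my own labels, not anything from a disassembler.

```python
# Sub-register names as views (masks) over one 64-bit value.
# `views` and its keys are hypothetical labels for this sketch.

def views(reg64):
    reg64 &= 0xFFFFFFFFFFFFFFFF
    return {
        "r64": reg64,                    # RAX / R8
        "r32": reg64 & 0xFFFFFFFF,       # EAX / R8D
        "r16": reg64 & 0xFFFF,           # AX  / R8W
        "r8_low": reg64 & 0xFF,          # AL  / R8B
        # Bits 8-15 have their own name only for AH/BH/CH/DH;
        # R8-R15 have no "high byte" alias.
        "r8_high": (reg64 >> 8) & 0xFF,
    }

v = views(0x1122334455667788)
for name, value in v.items():
    print(f"{name:8} = {value:#x}")
```

Reading a register through a smaller name never changes the register. What happens on a write depends on the width.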

Expansion of register size
The x64 gotcha I must remember (because it changes dataflow)
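On x64, if an instruction writes to a 32-bit subregister (like EAX), the result is zero-extended into the full 64-bit register (RAX). Writes to 8-bit or 16-bit subregisters (AL, AX) are not zero-extended; they leave the upper bits as they were. Microsoft’s x64 register overview states this plainly. (learn.microsoft.com)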
mov eax, 1
; implies rax = 00000000_00000001h
This matters in RE because it’s a clue: a lot of eax/ecx/edx/r8d activity often means 32-bit integers / counters, not “full pointer” math.
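A short Python sketch of the write rules, assuming the register starts out as all ones so the difference is visible. The helper names are hypothetical. The 32-bit case follows the zero-extension rule above. The 8-bit and 16-bit cases model the preserve-upper-bits behavior inherited from 32-bit x86.

```python
# What a write to each sub-register width does to the full 64-bit register.
# Helper names are hypothetical; this models plain `mov`-style writes only.

MASK64 = 0xFFFFFFFFFFFFFFFF

def write32(reg64, value):
    """Model `mov eax, value`: upper 32 bits of RAX become zero."""
    return value & 0xFFFFFFFF

def write16(reg64, value):
    """Model `mov ax, value`: bits 16-63 of RAX are preserved."""
    return (reg64 & ~0xFFFF & MASK64) | (value & 0xFFFF)

def write8_low(reg64, value):
    """Model `mov al, value`: bits 8-63 of RAX are preserved."""
    return (reg64 & ~0xFF & MASK64) | (value & 0xFF)

rax = 0xFFFFFFFFFFFFFFFF
print(f"{write32(rax, 1):016x}")     # 0000000000000001
print(f"{write16(rax, 1):016x}")     # ffffffffffff0001
print(f"{write8_low(rax, 1):016x}")  # ffffffffffffff01
```

So after a mov eax, ... I can treat RAX as a clean 32-bit value, but after a mov al, ... the old upper bits are still part of the dataflow.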
What they’re “used for” (in the way I actually see them in malware RE)
1) Stack + frame: SP/BP and modern equivalents
SP/ESP/RSP: where the stack is right now
BP/EBP/RBP: commonly used as a stable frame reference (especially in x86; in x64 it depends on compiler/flags, but it still appears a lot in RE)
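To see why RBP is "stable" while RSP keeps moving, here is a toy Python model of a 64-bit stack with a classic frame-pointer prologue and epilogue. The Stack class, the start address, and the return address 0x401005 are all invented for this sketch. Real code may omit the frame pointer entirely.

```python
# Toy 64-bit stack: RSP moves on every push/pop, RBP is set once per frame.
# Class name, addresses, and values are hypothetical.

class Stack:
    def __init__(self, top=0x7FFF0000):
        self.rsp = top
        self.rbp = 0
        self.mem = {}

    def push(self, value):
        self.rsp -= 8               # stack grows toward lower addresses
        self.mem[self.rsp] = value

    def pop(self):
        value = self.mem[self.rsp]
        self.rsp += 8
        return value

s = Stack()
s.push(0x401005)       # `call` pushes the return address
s.push(s.rbp)          # push rbp
s.rbp = s.rsp          # mov rbp, rsp
s.rsp -= 0x10          # sub rsp, 0x10  (room for locals)
s.mem[s.rbp - 8] = 42  # mov qword [rbp-8], 42  (a local)

# RSP has moved, but the frame layout relative to RBP is fixed:
return_address = s.mem[s.rbp + 8]  # [rbp+8] = return address
saved_rbp = s.mem[s.rbp]           # [rbp]   = caller's RBP

s.rsp = s.rbp          # mov rsp, rbp
s.rbp = s.pop()        # pop rbp
ret_target = s.pop()   # `ret` pops the return address
print(hex(return_address), hex(ret_target), hex(s.rsp))
```

In a frame-pointer build, negative offsets from RBP are locals and positive offsets reach the return address and anything the caller put on the stack. That is the pattern I look for first.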
2) Pointers + indexes: SI/DI and modern equivalents
SI/ESI/RSI and DI/EDI/RDI: frequently become “this is a pointer to a buffer” registers in decompiled/disassembled code (even if it’s not string ops)
3) “Return value lives here” intuition
On x64 (and, as EAX, in the common 32-bit conventions), integer and pointer return values come back in RAX. The Windows x64 docs list RAX as a volatile register used for the return value, so it’s the one to watch after calls. (learn.microsoft.com)
A quick “evolution cheat sheet” (what I keep nearby)
8086 / 16-bit set
AX BX CX DX SI DI BP SP (Wikipedia)
32-bit IA-32 set (the same, widened)
EAX EBX ECX EDX ESI EDI EBP ESP (docs.oracle.com)
64-bit x86-64 set (widened + extra)
RAX RBX RCX RDX RSI RDI RBP RSP R8 R9 R10 R11 R12 R13 R14 R15 (learn.microsoft.com)
Where I’m going next
Next I want to write up the part that actually makes control flow readable in malware:
flags (ZF/CF/SF/OF) + cmp/test + conditional jumps