Bootsector

Written by Amar Singh

‘How do you write a program that can run without any compilers, package managers, libraries or any user interface?’ This is a short and simple attempt at an answer.

Program without any OS

For inspiration, see the book ‘How to write an OS from Scratch’ which tries to answer that question.

If you intend to follow the book, you will probably need to have:

  1. A text editor that can handle binary files. In Emacs:
       M-x hexl-mode  (hex mode for binary files)
       M-x asm-mode   (assembly mode)
  2. An assembler, such as yasm or nasm for x86 and amd64.
  3. Emulation software, such as QEMU.

Each computer is designed with an ISA (instruction set architecture) in mind. An ISA is the bridge between computer hardware and computer software; you can think of it as the API for the underlying hardware. Examples of ISAs are Intel x86, amd64, armhf, aarch64, and the brand new and shiny RISC-V.

But most computers come with much more than just the ISA. The BIOS, for example, can provide services such as keyboard, mouse, hard disk and USB support, for use by the bootloader or the OS. It can also provide basic text printing, which helps you get further ahead in your bootsector hacking without having to write your own drivers from the get-go.

Assembly

If you can look beyond the syntax, assembly language bears a lot of resemblance to other programming languages. (Note: assembly is not one universal language; each ISA has its own.)

mov ah, 0x0e    ; select the BIOS teletype output function
mov al, 'A'     ; the character to print
int 0x10        ; call the BIOS video services

  1. Prefix notation for procedures, which here are called instructions. Just add parentheses: (mov al #\A)

  2. Values: hex notation for writing binary values directly, quoted characters, and double quotes for strings.

  3. BIOS interrupt: interrupt 10h (0x10) calls the BIOS facilities for character output; with ah set to 0x0e, it prints the character held in al. Scheme and Emacs Lisp have a similar feature in the ‘system’ and ‘shell-command’ procedures respectively, in that they let you use the utilities already provided by the system.

    jmp $

  4. A jump instruction which will unconditionally move the processor to read and execute code from a different location.

  5. ‘$’ syntax refers to the current position in code.

The net effect of this statement is to hang our machine, looping on this one instruction forever, but without crashing. Without it, the processor would keep reading and executing memory beyond our program, where there might be random bits that translate to some dangerous instruction like ‘delete everything on the hard-drive’.
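To see why `jmp $` loops in place, it helps to look at how such a jump is encoded. The sketch below assumes the assembler emits the two-byte short form (opcode 0xEB followed by a signed 8-bit displacement), which yasm and nasm normally choose for a target this close. The function name is made up for illustration.

```python
def short_jump_target(address: int, encoded: bytes) -> int:
    """Return the address a two-byte x86 short jump (EB rel8) lands on."""
    opcode, rel8 = encoded
    if opcode != 0xEB:
        raise ValueError("not a short jmp")
    # rel8 is a signed byte, counted from the end of the jump instruction.
    displacement = rel8 - 0x100 if rel8 >= 0x80 else rel8
    return address + len(encoded) + displacement


# Assumed encoding of `jmp $`: EB FE, i.e. a displacement of -2.
JMP_SELF = bytes([0xEB, 0xFE])

# Wherever the instruction sits, it jumps back to its own first byte.
for address in (0x0006, 0x7C06):
    print(hex(address), "->", hex(short_jump_target(address, JMP_SELF)))
```

Because the displacement is relative, the loop behaves the same whether you count addresses from the start of the file or from wherever the BIOS loads the sector.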

;; padding
times 510 - ($ - $$) db 0

To write a successful bootloader program it’s essential to not have random bits, so we pad the rest of the program with 0s until it fills the first 510 bytes of the sector.

We then need to tag our program so that the BIOS knows that this is a bootloader. This is achieved by setting the last 2 bytes of the bootloader’s 512 bytes to 0xaa55 (stored as the bytes 0x55 then 0xaa, since x86 is little-endian).
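The arithmetic behind `times 510 - ($ - $$) db 0` can be checked with a few lines of Python. This is only a sketch: the machine code bytes are what I would expect an assembler to emit for the four instructions above (two bytes each), and `zero_padding` is a hypothetical helper, not part of any tool.

```python
SECTOR_SIZE = 512
SIGNATURE_SIZE = 2

# Assumed machine code for: mov ah,0x0e / mov al,'A' / int 0x10 / jmp $
code = bytes([0xB4, 0x0E, 0xB0, 0x41, 0xCD, 0x10, 0xEB, 0xFE])


def zero_padding(program: bytes) -> bytes:
    """Mirror `times 510 - ($ - $$) db 0`: zeros from the end of the code up to byte 510."""
    free = SECTOR_SIZE - SIGNATURE_SIZE - len(program)
    if free < 0:
        raise ValueError("program does not fit in a boot sector")
    return bytes(free)


padded = code + zero_padding(code)
print(len(code), "bytes of code,", len(padded) - len(code), "bytes of padding")
```

Here `$ - $$` corresponds to `len(program)`: the distance from the start of the section to the current position.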

;; magic number
dw 0xaa55

Now the computer will recognise our program as a bootloader.
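One detail that is easy to trip over when reading boot.bin in a hex editor is byte order: the word 0xaa55 shows up as 55 aa. A minimal sketch of that, using Python's `struct` module to stand in for `dw`; `has_boot_signature` is a hypothetical helper name.

```python
import struct

BOOT_SIGNATURE = 0xAA55

# `dw 0xaa55` stores a 16-bit word in little-endian order on x86,
# so the low byte 0x55 comes first in the file.
signature_bytes = struct.pack("<H", BOOT_SIGNATURE)


def has_boot_signature(sector: bytes) -> bool:
    """True if a 512-byte sector ends with the bytes 55 AA."""
    return len(sector) == 512 and sector[510:512] == signature_bytes


# 510 zero bytes plus the signature: empty, but tagged as bootable.
blank_but_tagged = bytes(510) + signature_bytes
print(signature_bytes.hex(" "), has_boot_signature(blank_but_tagged))
```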

Save this to a file, say ‘boot.asm’, and assemble it:

yasm -fbin boot.asm -o boot.bin
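Before trying the result in an emulator, a quick sanity check on the output file can save some head-scratching. The script below is a rough sketch, not part of the yasm toolchain; it only assumes the output file is called boot.bin as in the command above, and `describe_boot_image` is a name invented here.

```python
from pathlib import Path


def describe_boot_image(data: bytes) -> str:
    """Summarise whether the given bytes look like a 512-byte boot sector."""
    if len(data) != 512:
        return f"{len(data)} bytes, expected 512"
    if data[510:] != b"\x55\xaa":
        return f"last two bytes are {data[510:].hex(' ')}, expected 55 aa"
    return "512 bytes, signature present"


image = Path("boot.bin")  # file name taken from the yasm command above
if image.exists():
    print(f"{image.name}: {describe_boot_image(image.read_bytes())}")
```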

Code is Data

For the code mentioned above, we can make a slight change after inspecting the assembled boot.bin with Emacs hexl-mode.

jmp $

We change this line to:

jmp 0x06

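This works because 0x06 is the offset of the jmp instruction itself within boot.bin: each of the three instructions before it takes two bytes. Essentially, we are able to reference the running code as if we had it stored as data in some variable.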
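The offset 0x06 can be reproduced by adding up instruction lengths, which is roughly what you do by eye in hexl-mode. The sketch below assumes the usual two-byte encodings for each instruction and no `org` directive, so offsets start at 0; the `offsets` helper is made up for illustration.

```python
# (mnemonic, assumed machine code) for each line of boot.asm before the padding.
instructions = [
    ("mov ah, 0x0e", bytes([0xB4, 0x0E])),
    ("mov al, 'A'", bytes([0xB0, 0x41])),
    ("int 0x10", bytes([0xCD, 0x10])),
    ("jmp $", bytes([0xEB, 0xFE])),
]


def offsets(listing):
    """Map each mnemonic to the byte offset where its encoding starts."""
    table, position = {}, 0
    for mnemonic, encoding in listing:
        table[mnemonic] = position
        position += len(encoding)
    return table


table = offsets(instructions)
for mnemonic, offset in table.items():
    print(f"0x{offset:02x}  {mnemonic}")
```

Since `jmp $` starts at offset 0x06, `jmp 0x06` names the same location and should assemble to the same bytes.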