Protected mode works but is ugly, CS, SS work, failed lidt real mode attempt

Ciro Santilli
2015-10-17 00:03:05 +02:00
parent b9b4da0d1e
commit febbb83254
20 changed files with 585 additions and 241 deletions


@@ -1,18 +1,18 @@
.POSIX:
ASM_EXT ?= .asm
S_EXT ?= .S
LD ?= ld
LINKER_SCRIPT ?= linker.ld
# Use gcc so that the preprocessor will run first.
GAS ?= gcc
GAS_EXT ?= .S
NASM_EXT ?= .asm
OBJ_EXT ?= .o
OUT_EXT ?= .img
QEMU ?= qemu-system-i386
RUN ?= bios_hello_world
TMP_EXT ?= .tmp
OUTS := $(foreach IN_EXT,$(ASM_EXT) $(S_EXT),$(patsubst %$(IN_EXT),%$(OUT_EXT),$(wildcard *$(IN_EXT))))
OUTS := $(foreach IN_EXT,$(NASM_EXT) $(GAS_EXT),$(patsubst %$(IN_EXT),%$(OUT_EXT),$(wildcard *$(IN_EXT))))
RUN_FILE := $(RUN)$(OUT_EXT)
.PRECIOUS: %$(OBJ_EXT)
@@ -23,10 +23,10 @@ all: $(OUTS)
%$(OUT_EXT): %$(OBJ_EXT) $(LINKER_SCRIPT)
$(LD) --oformat binary -o '$@' -T '$(LINKER_SCRIPT)' '$<'
%$(OBJ_EXT): %$(S_EXT)
%$(OBJ_EXT): %$(GAS_EXT)
$(GAS) -c -o '$@' '$<'
%$(OUT_EXT): %$(ASM_EXT)
%$(OUT_EXT): %$(NASM_EXT)
nasm -f bin -o '$@' '$<'
clean:
@@ -37,7 +37,11 @@ run: all
debug: all
$(QEMU) -hda '$(RUN_FILE)' -S -s &
gdb -ex 'target remote localhost:1234' -ex 'break *0x7c00' -ex 'continue'
gdb \
-ex 'target remote localhost:1234' \
-ex 'set architecture i8086' \
-ex 'break *0x7c00' \
-ex 'continue'
bochs: all
# Supposes size is already multiples of 512.


@@ -26,11 +26,14 @@ Hello world programs that run without an operating system.
1. [Initial state](initial_state.S)
1. [reboot](reboot.S)
1. [Not testable in userland](not-testable-in-userland.md)
1. [Segment registers real mode](segment_registers_real_mode.S)
1. [SS (TODO)](ss.S)
1. [Segment registers](segment_registers.S)
1. [SS](ss.S)
1. [CS](cs.S)
1. [Interrupt](interrupt.S)
1. [int $1](interrupt1.S)
1. [Interrupt zero divide](interrupt_zero_divide.S)
1. [Interrupt loop](interrupt_loop.S)
1. [lidt (TODO)](lidt.S)
1. in
1. [in keyboard](in_keyboard.S)
1. [in RTC](in_rtc.S)
@@ -39,6 +42,8 @@ Hello world programs that run without an operating system.
1. [in beep_illinois](in_beep_illinois.S)
1. [in mouse (TODO)](in_mouse.S)
1. [Protected mode](protected-mode.S)
1. Segmentation offset
1. Segmentation fault handler: memory bound, ring, RWX violations
1. APM
1. [APM shutdown](apm_shutdown.S)
1. [APM shutdown 2](apm_shutdown2.S)

16
TODO.md

@@ -23,14 +23,15 @@
- https://github.com/torvalds/linux/blob/v4.2/arch/x86/boot/boot.h#L78
- http://stackoverflow.com/questions/6793899/what-does-the-0x80-port-address-connects
- lgdtl
- lgdt:
- http://stackoverflow.com/questions/21128311/the-physical-address-of-global-descriptor-table
- http://stackoverflow.com/questions/7415515/problem-accessing-control-registers-cr0-cr2-cr3
- http://stackoverflow.com/questions/10671147/how-do-x86-page-tables-work?rq=1
- http://stackoverflow.com/questions/14354626/how-to-create-two-separate-segments-in-global-descriptor-table
- http://stackoverflow.com/questions/14812160/near-and-far-jmps
- lidtl, interrupts, IDTR
- lidt, interrupts, IDTR:
- http://stackoverflow.com/questions/3392831/what-happens-in-an-interrupt-service-routine
- http://stackoverflow.com/questions/1817577/what-does-int-0x80-mean-in-assembly-code
@@ -44,15 +45,18 @@
- Segment registers and protected mode. Then try to answer all of:
http://stackoverflow.com/questions/18736663/what-does-the-colon-mean-in-x86-assembly-gas-syntax-as-in-dsbx
- http://reverseengineering.stackexchange.com/questions/2006/how-are-the-segment-registers-fs-gs-cs-ss-ds-es-used-in-linux
- http://stackoverflow.com/questions/10810203/what-is-the-fs-gs-register-intended-for
- http://stackoverflow.com/questions/12760109/data-segment-in-x86-programs
- http://stackoverflow.com/questions/14480579/when-does-segment-registers-change
- http://stackoverflow.com/questions/14661916/gdt-segmented-memory
- http://stackoverflow.com/questions/15335003/x86-protected-mode-segment-registers-purpose
- http://stackoverflow.com/questions/17210620/assembler-calculating-a-memory-address-with-register-base?lq=1
http://stackoverflow.com/questions/18736663/what-does-the-colon-mean-in-x86-assembly-gas-syntax-as-in-dsbx
- http://stackoverflow.com/questions/18247106/implementing-gdt-with-basic-kernel
- http://stackoverflow.com/questions/20717890/how-to-interpret-gs0x14?lq=1
- http://stackoverflow.com/questions/22446104/do-the-x86-segment-registers-have-special-meaning-usage-on-modern-cpus-and-oses?lq=1
- http://stackoverflow.com/questions/22962251/how-to-enter-64-bit-mode-on-a-x86-64/22963701#22963701
- http://stackoverflow.com/questions/26058665/fs-register-in-assembly-code?lq=1
- http://stackoverflow.com/questions/3819699/what-does-ds40207a-mean-in-assembly
- http://stackoverflow.com/questions/4903906/assembly-using-the-data-segment-register-ds answer with minimal example and mention QEMU vs real hardware http://wiki.osdev.org/Segmentation
@@ -60,11 +64,9 @@
- http://stackoverflow.com/questions/5364270/concept-of-mov-ax-cs-and-mov-ds-ax?lq=1
- http://stackoverflow.com/questions/6611346/amd64-fs-gs-registers-in-linux
- http://stackoverflow.com/questions/7844963/how-to-interpret-segment-register-accesses-on-x86-64?lq=1
- http://stackoverflow.com/questions/928082/why-does-the-mov-instruction-have-to-be-used-this-way?lq=1
- http://stackoverflow.com/questions/14661916/gdt-segmented-memory
- http://stackoverflow.com/questions/18247106/implementing-gdt-with-basic-kernel
- http://stackoverflow.com/questions/9172837/idt-without-gdt-using-grub
- http://stackoverflow.com/questions/22962251/how-to-enter-64-bit-mode-on-a-x86-64/22963701#22963701
- http://stackoverflow.com/questions/9249315/what-is-gs-in-assembly?rq=1
- http://stackoverflow.com/questions/928082/why-does-the-mov-instruction-have-to-be-used-this-way?lq=1
64-bit:


@@ -57,6 +57,14 @@ The following did not work on my machine out of the box:
- <http://www.brokenthorn.com/Resources/OSDevIndex.html>
- <http://skelix.net/skelixos/index_en.html>
Cleaned up version: <https://github.com/cirosantilli/skelix-os>
Not tested yet.
GAS based, no GRUB needed.
## Actually useful
These are not meant as learning resources but rather as useful programs:


@@ -37,7 +37,7 @@ The most common modes seem to be:
- 0x03: 80x25 Text, 16 colors, 8 pages
- 0x13: 320x200 Graphics, 256 colors, 1 page
You can add 128 to the modes to avoid clearing the screen.
You can add 128 to the modes to prevent them from clearing the screen.
Taken from: <https://courses.engr.illinois.edu/ece390/books/labmanual/graphics-int10h.html>
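
The "add 128" rule can be written as a one-line helper. This is only a sketch in C: it assumes the mode number goes in AL for `int $0x10` with AH = 0x00, and the names `VIDEO_MODE_NO_CLEAR` and `video_mode_keep_screen` are made up for illustration.

```c
#include <stdint.h>

// Bit 7 of the mode number (value 128) asks the BIOS not to clear the screen.
#define VIDEO_MODE_NO_CLEAR 0x80u

// Returns the mode number with the "do not clear" bit set.
uint8_t video_mode_keep_screen(uint8_t mode) {
    return (uint8_t)(mode | VIDEO_MODE_NO_CLEAR);
}
```

For the text mode 0x03 this gives 0x83, and for the graphics mode 0x13 it gives 0x93.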


@@ -1,4 +1,8 @@
/*
# Bios disk load
# int 13h
Load one more sector from the disk
besides the first 512 bytes and do something with it.
@@ -19,9 +23,9 @@ BIOS call used for disk operations.
## Bibliography
- https://en.wikipedia.org/wiki/INT_13H
- http://wiki.osdev.org/ATA_in_x86_RealMode_%28BIOS%29
- https://thiscouldbebetter.wordpress.com/2011/03/15/creating-a-bootable-program-in-assembly-language/
- https://en.wikipedia.org/wiki/INT_13H
- http://stackoverflow.com/questions/19381434/cannot-read-disk-sectors-in-assembly-language
- http://stackoverflow.com/questions/15497842/read-a-sector-from-hard-drive-with-int-13h
*/
@@ -58,7 +62,14 @@ BEGIN
mov $0, %dh
/* Starting sector number. 2 because 1 was already loaded. */
mov $2, %cl
/* Where to load to. Must coincide with our stage2 for the linking to work. */
/*
Where to load to.
Must coincide with our stage2 for the linking to work.
The address is calculated as:
16 * ES + BX
*/
mov $stage2, %bx
int $0x13


@@ -9,16 +9,52 @@ The big ones do bloat the executable.
#define BEGIN \
.code16 ;\
cli ;\
/* This sets %cs to 0. TODO Is that really needed? */ ;\
ljmp $0, $1f;\
1:;\
xor %ax, %ax ;\
/* We must zero %ds for any data access.. */ \
/* We must zero %ds for any data access. */ \
mov %ax, %ds ;\
/* TODO is it really necessary to clear all those segment registers, e.g. for BIOS calls? */ \
mov %ax, %es ;\
mov %ax, %fs ;\
mov %ax, %gs ;\
/* TODO What to move into BP and SP? http://stackoverflow.com/questions/10598802/which-value-should-be-used-for-sp-for-booting-process */ \
mov $0x0000, %bp ;\
/* Disables interrupts until the end of the next instruction. */ \
/* Automatically disables interrupts until the end of the next instruction. */ \
mov %ax, %ss ;\
/* We should set SP because BIOS calls may depend on that. TODO confirm. */ \
mov %bp, %sp
/*
Load stage2 from disk to memory, and jump to it.
TODO not working.
To be used when the program does not fit in the 512 bytes.
Sample usage:
STAGE2
Stage 2 code here.
*/
#define STAGE2 \
mov $2, %ah;\
/* TODO get working on linker script. Above my paygrade for now, so I just load a bunch of sectors instead. */;\
/* mov __stage2_size, %al;\ */;\
mov $9, %al;\
mov $0x80, %dl;\
mov $0, %ch;\
mov $0, %dh;\
mov $2, %cl;\
mov $1f, %bx;\
int $0x13;\
jmp 1f;\
.section .stage2;\
1:
/* BIOS */
#define CURSOR_POSITION(x, y) \
mov $0x02, %ah;\
mov $0x00, %bh;\
@@ -130,27 +166,4 @@ loop:
end:
.endm
/*
Load stage2 from disk to memory, and jump to it.
TODO not working?
To be used when the program does not fit in the 512 bytes.
Sample usage:
STAGE2
Stage 2 code here.
*/
#define STAGE2 \
mov $2, %ah;\
mov __stage2_size, %al;\
mov $0x80, %dl;\
mov $0, %ch;\
mov $0, %dh;\
mov $2, %cl;\
mov $1f, %bx;\
int $0x13;\
jmp 1f;\
.section .stage2;\
1:
/* VGA */

31
cs.S Normal file

@@ -0,0 +1,31 @@
/*
# CS segment register
# ljmp
Expected outcome: "0102" gets printed. Those are the 2 values that CS takes in this example.
Explanation: http://stackoverflow.com/a/33177253/895245
TODO is ljmp encodable except with a constant:
- http://stackoverflow.com/questions/1685654/ljmp-syntax-in-gcc-inline-assembly
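
The two `.skip` sizes used below follow from the real mode address formula, 16 * CS + IP. Here is a minimal C sketch of that arithmetic. It assumes the image runs at its link address and ignores the 1 MiB wraparound of older CPUs; `real_mode_address` is a name invented here.

```c
#include <stdint.h>

// Physical address of segment:offset in 16-bit real mode: 16 * segment + offset.
uint32_t real_mode_address(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}
```

If the first `1:` label were linked at 0x7c20 (an illustrative value, not taken from the build), `ljmp $1, $1f` would continue at `real_mode_address(1, 0x7c20)`, which is 0x7c30: exactly 0x10 bytes past the label, where the `.skip 0x10` ends.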
*/
#include "common.h"
BEGIN
CLEAR
ljmp $1, $1f
1:
.skip 0x10
mov %cs, %ax
PRINT_HEX(%al)
ljmp $2, $1f
1:
.skip 0x20
mov %cs, %ax
PRINT_HEX(%al)
hlt


@@ -21,14 +21,12 @@ INITIAL_STORE(ax)
INITIAL_STORE(bx)
INITIAL_STORE(cx)
INITIAL_STORE(dx)
/*
INITIAL_STORE(cs)
INITIAL_STORE(ds)
INITIAL_STORE(es)
INITIAL_STORE(fs)
INITIAL_STORE(gs)
INITIAL_STORE(ss)
*/
BEGIN
@@ -38,18 +36,12 @@ INITIAL_PRINT(ax)
INITIAL_PRINT(bx)
INITIAL_PRINT(cx)
INITIAL_PRINT(dx)
/*
TODO this breaks if I add more code here.
Linked to STAGE2 load I imagine.
*/
/*
INITIAL_PRINT(cs)
INITIAL_PRINT(ds)
INITIAL_PRINT(es)
INITIAL_PRINT(fs)
INITIAL_PRINT(gs)
INITIAL_PRINT(ss)
*/
hlt
@@ -57,11 +49,9 @@ INITIAL_DATA(ax)
INITIAL_DATA(bx)
INITIAL_DATA(cx)
INITIAL_DATA(dx)
/*
INITIAL_DATA(cs)
INITIAL_DATA(ds)
INITIAL_DATA(es)
INITIAL_DATA(fs)
INITIAL_DATA(gs)
INITIAL_DATA(ss)
*/


@@ -1,21 +1,30 @@
/*
# Interrupt
Minimal interrupt example.
Should print the characters 'ab' to screen.
Expected outcome: 'ab' gets printed to the screen.
TODO: is STI not needed because this interrupt is not maskable?
TODO: use IDTR as a base. Is the initial value 0 guaranteed?
TODO: interrupt priority: order looks like: 0, 1, 2, 8, 9, 10, 11, 12, 13, 14, 15, 3, 4, 5, 6, 7. What is that?
TODO: interrupt priority: order looks like: 0, 1, 2, 8, 9, 10, 11, 12, 13, 14, 15, 3, 4, 5, 6, 7
## int
What it does:
- long jumps to the CS : IP found in the corresponding interrupt vector.
- before jumping, pushes FLAGS, CS and IP (in that order), so that iret can pop them to return and restore the flags.
## iret
Returns to the next instruction to be executed
before the interrupt came in.
TODO I think this is mandatory, e.g. a `jmp` wouldn't be enough.
Otherwise what?
I think this is mandatory, e.g. a `jmp` wouldn't be enough because:
- we may have far jumped
- iret also pops EFLAGS, restoring it
## ISR
@@ -24,6 +33,25 @@ Otherwise what?
Fancy name for the handler.
http://wiki.osdev.org/Interrupt_Service_Routines
## Interrupt descriptor table
## IDTR
## Interrupt descriptor table register
IDTR points to the IDT.
The IDT contains the list of callbacks for each interrupt.
This name seems to be reserved for 32-bit protected mode; IVT is the 16-bit term.
## IVT
http://wiki.osdev.org/IVT
osdev says that the default address is 0:0, and that it shouldn't be changed by LIDT,
as it is incompatible with older CPUs.
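
The memory addresses that the examples write to (such as 0x02 for the CS of `int $0`, or 0x04 and 0x06 in interrupt1.S) can be derived as below. This is a small C sketch, assuming the IVT sits at 0:0 and that each entry is 4 bytes, offset first and segment second; the function names are invented for illustration.

```c
#include <stdint.h>

// Each real mode IVT entry is 4 bytes: a 16-bit offset (IP), then a 16-bit segment (CS).
#define IVT_ENTRY_SIZE 4u

// Address of the offset (IP) word of interrupt n, for an IVT based at 0:0.
uint16_t ivt_offset_address(uint8_t n) {
    return (uint16_t)(n * IVT_ENTRY_SIZE);
}

// Address of the segment (CS) word of interrupt n, right after the offset word.
uint16_t ivt_segment_address(uint8_t n) {
    return (uint16_t)(n * IVT_ENTRY_SIZE + 2u);
}
```

So the handler for `int $1` is registered by storing its offset at 0x04 and its segment at 0x06.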
*/
#include "common.h"
@@ -33,9 +61,7 @@ BEGIN
mov %cs, 0x02
int $0
PUTC($0x62)
jmp end
hlt
handler:
PUTC($0x61)
iret
end:
hlt

17
interrupt1.S Normal file
View File

@@ -0,0 +1,17 @@
/*
Test an interrupt number other than 0.
Expected outcome: 'ab' gets printed to the screen.
*/
#include "common.h"
BEGIN
CLEAR
movw $handler, 0x04
mov %cs, 0x06
int $1
PUTC($0x62)
hlt
handler:
PUTC($0x61)
iret

43
lidt.S Normal file
View File

@@ -0,0 +1,43 @@
/*
# lidt
TODO get working:
- http://wiki.osdev.org/Real_Mode
Sets the IDTR from a descriptor in memory, which tells the CPU where the IDT is in memory.
Expected outcome: 'ab' gets printed to the screen.
osdev says this is not compatible with older CPUs.
# sidt
Read the descriptor register to memory.
*/
#include "common.h"
BEGIN
CLEAR
lidt idt_descriptor
movw $handler, 0x04
mov %cs, 0x06
int $0
PUTC($0x62)
hlt
idt:
.word 2
.word 4
idt_end:
idt_descriptor:
.word idt_end - idt
.long idt
handler:
PUTC($0x61)
iret


@@ -31,6 +31,8 @@ SECTIONS
*(.stage2)
/*
TODO get this working.
Number of sectors in stage 2. Used by the `int 13`.
The value gets put into memory as the very last thing
@@ -39,9 +41,10 @@ SECTIONS
We must put it *before* the final `. = ALIGN(512)`,
or else it would fall out of the loaded memory.
*/
__stage2_size = (ALIGN(.) / 512) - 1;
__stage2_size = .;
/*BYTE((ALIGN(.) / 512) - 1);*/
/* Ensure that the generated image is a multiple of 512. */
/* Ensure that the generated image is a multiple of 512 bytes long. */
. = ALIGN(512);
}
}


@@ -3,3 +3,5 @@
While NASM is a bit more convenient than GAS to write a boot sector, I think it is just not worth it.
When writing an OS in C, we are going to use GCC, which already uses GAS. So it's better to reduce the number of assemblers to one and stick to GAS only.
Right now, this directory is not very DRY since NASM is secondary to me, so it contains mostly some copy / paste examples.


@@ -1,138 +0,0 @@
; 2-stage protected mode entry.
;
; Initially taken from:
; https://thiscouldbebetter.wordpress.com/2011/03/17/entering-protected-mode-from-assembly/
use16
org 0x7C00 ; boot sector address
Boot:
;
mov ah,0x00 ; reset disk
mov dl,0 ; drive number
int 0x13
;
mov ah,0x02 ; read sectors into memory
mov al,0x10 ; number of sectors to read (16)
mov dl,0x80 ; drive number
mov ch,0 ; cylinder number
mov dh,0 ; head number
mov cl,2 ; starting sector number
mov bx,Main ; address to load to
int 0x13 ; call the interrupt routine
;
jmp Main
;
PreviousLabel:
PadOutWithZeroesSectorOne:
times ((0x200 - 2) - ($ - $$)) db 0x00
BootSectorSignature:
dw 0xAA55
;===========================================
Main:
;
; set the display to VGA text mode now
; because interrupts must be disabled
;
mov ax,3
int 0x10 ; set VGA text mode 3
;
; set up data for entering protected mode
;
xor edx,edx ; edx = 0
mov dx,ds ; get the data segment
shl edx,4 ; shift it left a nibble
add [GlobalDescriptorTable+2],edx ; GDT's base addr = edx
;
lgdt [GlobalDescriptorTable] ; load the GDT
mov eax,cr0 ; eax = machine status word (MSW)
or al,1 ; set the protection enable bit of the MSW to 1
;
cli ; disable interrupts
mov cr0,eax ; start protected mode
;
mov bx,0x08 ; the size of a GDT descriptor is 8 bytes
mov fs,bx ; fs = the 2nd GDT descriptor, a 4 GB data seg
;
; write a status message
;
mov ebx,0xB8000 ; address of first char for VGA mode 3
;
mov si,TextProtectedMode ; si = message text
;
ForEachChar:
;
lodsb ; get next char
cmp al,0x00 ; if it's null, break
je EndForEachChar
;
mov [fs:ebx],al ; write char to display memory
;
inc ebx ; 2 bytes per char
inc ebx ; so increment twice
;
jmp ForEachChar
EndForEachChar:
;
LoopForever: jmp LoopForever
;
ret
;
TextProtectedMode: db 'The processor is in protected mode.',0
GlobalDescriptorTable:
; the global descriptor table is the heart of protected mode
; entries are used to map virtual to physical memory
; among other things
;
; each descriptor contains 8 bytes, "organized" as follows:
;
; |----------------------2 bytes--------------------|
;
; +-------------------------------------------------+
; | segment address 24-31 | flags #2 | len 16-19 | +6
; +-------------------------------------------------+
; | flags #1 | segment address 16-23 | +4
; +-------------------------------------------------+
; | segment address bits 0-15 | +2
; +-------------------------------------------------+
; | segment length bits 0-15 | +0
; +-------------------------------------------------+
; the high-order bit of flags #2 controls "granularity"
; setting it to 1 multiplies the segment length by 4096
;======================================================
; create two descriptors:
; one for the GDT itself, plus a 4 gigabyte data segment
dw GlobalDescriptorTableEnd - GlobalDescriptorTable - 1
; segment address bits 0-15, 16-23
dw GlobalDescriptorTable
db 0
; flags 1, segment length 16-19 + flags 2
db 0, 0
; segment address bits 24-31
db 0
; a data segment based at address 0, 4 gigabytes long
;
dw 0xFFFF ; segment length 0-15
db 0, 0, 0 ; segment address 0-15, 16-23
db 0x91 ; flags 1
db 0xCF ; flags 2, segment length 16-19
db 0 ; segment address 24-31
;
GlobalDescriptorTableEnd:
;===========================================
PadOutWithZeroesSectorsAll:
times (0x2000 - ($ - $$)) db 0x00


@@ -0,0 +1,93 @@
; http://stackoverflow.com/a/28645943/895245
[bits 16]
[org 0x7c00]
mov ax, 0
mov ss, ax
mov sp, 0xFFFC
mov ax, 0
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
cli
lgdt[gdt_descriptor]
mov eax, cr0
or eax, 0x1
mov cr0, eax
; TODO why is this needed?
jmp CODE_SEG:b32
[bits 32]
print32:
pusha
; Video memory address.
mov edx, 0xb8000
.loop:
mov al, [ebx]
; White on black.
mov ah, 0x0f
cmp al, 0
je .done
mov [edx], ax
add ebx, 1
add edx, 2
jmp .loop
.done:
popa
ret
b32:
mov ax, DATA_SEG
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
mov ss, ax
mov ebp, 0x2000
mov esp, ebp
mov ebx, message
call print32
jmp $
gdt_start:
gdt_null:
dd 0x0
dd 0x0
gdt_code:
dw 0xffff
dw 0x0
db 0x0
db 10011010b
db 11001111b
db 0x0
gdt_data:
dw 0xffff
dw 0x0
db 0x0
db 10010010b
db 11001111b
db 0x0
gdt_end:
gdt_descriptor:
dw gdt_end - gdt_start - 1 ; The GDTR limit is the table size in bytes minus 1.
dd gdt_start
CODE_SEG equ gdt_code - gdt_start
DATA_SEG equ gdt_data - gdt_start
message db 'hello world', 0
[SECTION signature start=0x7dfe]
dw 0AA55h


@@ -0,0 +1,115 @@
; 2-stage protected mode entry.
;
; Initially taken from:
; https://thiscouldbebetter.wordpress.com/2011/03/17/entering-protected-mode-from-assembly/
;
; This works on QEMU, but I don't trust the expertise of that author very much.
; E.g., this does not have any `[BITS 32]`?
use16
org 0x7C00
mov ah, 0x00
mov dl, 0
int 0x13
mov ah, 0x02
mov al, 0x10
mov dl, 0x80
mov ch, 0
mov dh, 0
mov cl, 2
mov bx, stage2
int 0x13
jmp stage2
times ((0x200 - 2) - ($ - $$)) db 0x00
dw 0xAA55
stage2:
; Set up data for entering protected mode.
xor edx, edx ; edx = 0
mov dx, ds ; get the data segment
shl edx, 4 ; shift it left a nibble
add [gdt+2], edx ; GDT's base addr = edx
lgdt [gdt] ; load the GDT
mov eax, cr0 ; eax = machine status word (MSW)
or al, 1 ; set the protection enable bit of the MSW to 1
cli
mov cr0, eax ; start protected mode
mov bx, 0x08 ; the size of a GDT descriptor is 8 bytes
mov fs, bx ; fs = the 2nd GDT descriptor, a 4 GB data seg
; Write a status message.
mov ebx, 0xB8000 ; address of first char for VGA mode 3
mov si, message ; si = message text
for_each_char:
lodsb
cmp al, 0x00
je end_for_each_char
mov [fs:ebx], al
inc ebx
inc ebx
jmp for_each_char
end_for_each_char:
loop_forever:
jmp loop_forever
ret
message: db 'hello world', 0
gdt:
; the global descriptor table is the heart of protected mode
; entries are used to map virtual to physical memory
; among other things
;
; each descriptor contains 8 bytes, "organized" as follows:
;
; |----------------------2 bytes--------------------|
;
; +-------------------------------------------------+
; | segment address 24-31 | flags #2 | len 16-19 | +6
; +-------------------------------------------------+
; | flags #1 | segment address 16-23 | +4
; +-------------------------------------------------+
; | segment address bits 0-15 | +2
; +-------------------------------------------------+
; | segment length bits 0-15 | +0
; +-------------------------------------------------+
; the high-order bit of flags #2 controls "granularity"
; setting it to 1 multiplies the segment length by 4096
;======================================================
; create two descriptors:
; one for the GDT itself, plus a 4 gigabyte data segment
dw gdt_end - gdt - 1
; segment address bits 0-15, 16-23
dw gdt
db 0
; flags 1, segment length 16-19 + flags 2
db 0, 0
; segment address bits 24-31
db 0
; a data segment based at address 0, 4 gigabytes long
dw 0xFFFF ; segment length 0-15
db 0, 0, 0 ; segment address 0-15, 16-23
db 0x91 ; flags 1
db 0xCF ; flags 2, segment length 16-19
db 0 ; segment address 24-31
gdt_end:
times (0x2000 - ($ - $$)) db 0x00


@@ -11,13 +11,14 @@ Major changes:
- we have to encode instructions differently.
Note that in 16-bit 32-bit instructions were encodable, but with a prefix.
TODO get working.
## Bibliography
- http://stackoverflow.com/questions/28645439/how-do-i-enter-32-bit-protected-mode-in-nasm-assembly
- http://wiki.osdev.org/Journey_To_The_Protected_Land
- http://wiki.osdev.org/Protected_Mode
- https://github.com/chrisdew/xv6/blob/master/bootasm.S
- https://thiscouldbebetter.wordpress.com/2011/03/17/entering-protected-mode-from-assembly/ FASM based. Did not work on first try, but looks real clean.
- http://stackoverflow.com/questions/28645439/how-do-i-enter-32-bit-protected-mode-in-nasm-assembly
- http://skelix.net/skelixos/tutorial02_en.html
## GDT
@@ -25,10 +26,41 @@ Table in memory that gives properties of segment registers.
Segment registers in protected mode point to entries of that table.
The GDT modifies every memory access of a given segment by adding an offset to it.
GDT is used as soon as we enter protected mode, so that's why we have to deal with it, but the preferred way of managing program memory spaces is paging.
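
How a segment register value picks a table entry can be sketched in C. This assumes the usual x86 selector layout (entry index in bits 3 and up, table indicator in bit 2, requested privilege level in bits 0 to 1) and 8-byte entries; the function names are invented here.

```c
#include <stdint.h>

// Index of the selected descriptor table entry (bits 3..15).
uint16_t selector_index(uint16_t selector) {
    return (uint16_t)(selector >> 3);
}

// Table indicator (bit 2): 0 means GDT, 1 means LDT.
uint16_t selector_table(uint16_t selector) {
    return (uint16_t)((selector >> 2) & 1u);
}

// Requested privilege level (bits 0..1).
uint16_t selector_rpl(uint16_t selector) {
    return (uint16_t)(selector & 3u);
}

// Byte offset of the selected entry from the start of its table.
uint16_t selector_byte_offset(uint16_t selector) {
    return (uint16_t)(selector_index(selector) * 8u);
}
```

With this layout, a selector of 0x08 refers to index 1 of the GDT, at byte offset 8 from the table start, with requested privilege level 0.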
### Effect on memory access
The GDT modifies every memory access of a given segment by:
- adding an offset to it
- limiting how big the segment is
If an access is made at an offset larger than allowed: TODO some exception happens, which is like an interrupt, and gets handled by a previously registered handler.
The GDT could be used to implement virtual memory by using one segment per program:
+-----------+--------+--------------------------+
| Program 1 | Unused | Program 2 |
+-----------+--------+--------------------------+
^ ^ ^ ^
| | | |
Start1 End1 Start2 End2
The problem with that is that each program must have one segment, so if we have too many programs, fragmentation will be very large.
Paging gets around this by allowing discontinuous memory ranges of fixed size for each program.
The format of the GDT is given at: http://wiki.osdev.org/Global_Descriptor_Table
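
The two effects on memory access can be modeled in a few lines of C. This is a simplified sketch, not how the CPU is implemented: it assumes an expand-up segment whose limit is the last valid byte offset, checks a single-byte access only, and uses a hypothetical `struct segment` that keeps just the base and the limit.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical simplified view of a segment: only base and limit, in bytes.
struct segment {
    uint32_t base;   // Linear address where the segment starts.
    uint32_t limit;  // Largest offset that may be accessed.
};

// Stores base + offset in *linear and returns true if the offset is allowed.
// Returns false, leaving *linear untouched, where the CPU would raise an exception.
bool segment_translate(const struct segment *seg, uint32_t offset, uint32_t *linear) {
    if (offset > seg->limit) {
        return false;
    }
    *linear = seg->base + offset;
    return true;
}
```

For a segment with base 0x1000 and limit 0x0FFF, offset 0x10 maps to 0x1010, while offset 0x1000 is rejected.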
### Effect on permissions
Besides fixing segment sizes, the GDT also specifies permissions to the program that is running:
- ring level: limits what can and cannot be done, in particular:
- instructions: e.g. no in / out in ring 3
- register access: e.g. cannot modify control registers like the GDTR in ring 3. Otherwise user programs could just escape restrictions by changing that!
- executable, readable and writable bits: which operations can be done
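
To see where these fields live, the following C sketch decodes one 8-byte GDT entry. It assumes the standard layout of code and data descriptors as documented on the osdev page linked above; `struct gdt_entry_info` and `gdt_decode` are names made up for this example, and only a subset of the bits is decoded.

```c
#include <stdbool.h>
#include <stdint.h>

// Decoded fields of one 8-byte GDT entry (subset only).
struct gdt_entry_info {
    uint32_t base;              // Linear start address of the segment.
    uint32_t limit;             // Last valid byte offset, after applying granularity.
    uint8_t ring;               // Descriptor privilege level, 0 (most privileged) to 3.
    bool present;               // Entry is valid.
    bool executable;            // true: code segment, false: data segment.
    bool readable_or_writable;  // Code segment: readable. Data segment: writable.
};

struct gdt_entry_info gdt_decode(const uint8_t e[8]) {
    struct gdt_entry_info info;
    // The 20-bit limit is split between bytes 0-1 and the low nibble of byte 6.
    uint32_t raw_limit = (uint32_t)e[0] | ((uint32_t)e[1] << 8) |
                         ((uint32_t)(e[6] & 0x0Fu) << 16);
    // Top bit of byte 6: the limit counts 4 KiB pages instead of bytes.
    bool page_granular = (e[6] & 0x80u) != 0;
    // The 32-bit base is split between bytes 2-4 and byte 7.
    info.base = (uint32_t)e[2] | ((uint32_t)e[3] << 8) |
                ((uint32_t)e[4] << 16) | ((uint32_t)e[7] << 24);
    info.limit = page_granular ? ((raw_limit << 12) | 0xFFFu) : raw_limit;
    // Byte 5 is the access byte.
    info.present = (e[5] & 0x80u) != 0;
    info.ring = (uint8_t)((e[5] >> 5) & 3u);
    info.executable = (e[5] & 0x08u) != 0;
    info.readable_or_writable = (e[5] & 0x02u) != 0;
    return info;
}
```

Fed the bytes `ff ff 00 00 00 9a cf 00`, this reports a present, ring 0, executable and readable segment with base 0 and limit 0xFFFFFFFF; changing the access byte to `92` gives a writable data segment instead.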
## GDTR
## GDT register
@@ -45,29 +77,94 @@ GRUB seems to setup one for you: http://www.jamesmolloy.co.uk/tutorial_html/4.-T
#include "common.h"
BEGIN
cli
/* Set the GDT register with start address of Global Descriptor Table */
lgdt gdt
mov %cr0, %eax
CLEAR
cli
/* Tell the processor where our Global Descriptor Table is in memory. */
lgdt gdt_descriptor
/* Set PE (Protection Enable) bit in CR0 (Control Register 0) */
or $1, %al
mov %cr0, %eax
orl $0x1, %eax
mov %eax, %cr0
/* TODO why does equ not work? What is the alternative? */
/*ljmp $CODE_SEG, $b32*/
/*
TODO why 8?
This is needed to set `%cs` to 8.
Perform far jump to selector 08h (offset into GDT,
pointing at a 32bit PM code segment descriptor)
to load CS with the proper PM32 descriptor.
8 means the second entry, since each entry is 8 bytes wide,
and we have an initial null entry.
*/
ljmp $0x08, $PModeMain
ljmp $0x08, $b32
PModeMain:
/* TODO load DS, ES, FS, GS, SS, ESP. */
.code32
hlt
print32:
pusha
# Video memory address.
mov $0xb8000, %edx
print32.loop:
mov (%ebx), %al
# White on black.
mov $0x0f, %ah
cmp $0, %al
je print32.done
mov %ax, (%edx)
add $1, %ebx
add $2, %edx
jmp print32.loop
print32.done:
popa
ret
gdt:
.word 0x1234
.long 0x12345678
b32:
/* Setup the other segments. */
mov $DATA_SEG, %ax
mov %ax, %ds
mov %ax, %es
mov %ax, %fs
mov %ax, %gs
mov %ax, %ss
/* TODO: what is the maximum we can put here? Why does `FFFFFFFF` fail? */
mov $0XFFFFFF00, %ebp
mov %ebp, %esp
mov $message, %ebx
call print32
jmp .
gdt_start:
gdt_null:
.long 0x0
.long 0x0
gdt_code:
.word 0xffff
.word 0x0
.byte 0x0
.byte 0b10011010
.byte 0b11001111
.byte 0x0
gdt_data:
.word 0xffff
.word 0x0
.byte 0x0
.byte 0b10010010
.byte 0b11001111
.byte 0x0
gdt_end:
gdt_descriptor:
.word gdt_end - gdt_start - 1 /* The GDTR limit is the table size in bytes minus 1. */
.long gdt_start
.equ CODE_SEG, gdt_code - gdt_start
.equ DATA_SEG, gdt_data - gdt_start
message: .asciz "hello world"


@@ -1,5 +1,5 @@
/*
# Segment registers real mde
# Segment registers
Show how most segment registers work in 16-bit real mode.
@@ -11,10 +11,17 @@ I think their goal was to implement process virtualization in the past.
The special semantics of other registers will be covered in other files.
Rationale of the registers:
- extend other registers, e.g. http://stackoverflow.com/questions/17777146/what-is-the-purpose-of-cs-and-ip-registers-intel-8086
- rudimentary virtual process spaces
## ES
TODO: this does seem to have special properties as used by string instructions.
- used by the `int 13h` BIOS disk read service
## FS
## GS
@@ -63,6 +70,7 @@ This makes `ds` the most efficient one for data access, and thus a good default.
#include "common.h"
BEGIN
CLEAR
/*
It is not possible to encode moving immediates
to segment registers: we must either:
@@ -82,20 +90,6 @@ BEGIN
mov msg, %al
PUTC(%al)
/* Try using other segment as well. */
/*
CS is the exception: if we do this, the program halts. TODO why?
Some info: http://wiki.osdev.org/Segmentation#Operations_that_affect_segment_registers
*/
/*
mov $1, %ax
mov %ax, %cs
mov %cs:msg, %al
PUTC(%al)
*/
mov $1, %ax
mov %ax, %es
mov %es:msg, %al

36
ss.S

@@ -1,10 +1,38 @@
/*
TODO implement.
# SS segment register
SS register. I think the major effect of it is that anything that uses
`SP` like `PUSH` and `POP`, will actually use `SS:SP`.
Expected output: "0102".
I think the major effect of it is that anything that uses
`SP` like `PUSH` and `POP`, will actually use 16 * SS + SP instead.
*/
#include "common.h"
BEGIN
hlt
/* Save the good sp for later. */
mov %sp, %bx
/* Control group: ss == 0. */
mov $stack, %sp
pop %ax
/* Restore the old stack so that it won't mess with our other functions. */
mov %bx, %sp
PRINT_HEX(%al)
/* Now let's move ss and see if anything happens. */
mov $1, %ax
mov %ax, %ss
mov $stack, %sp
/* This pop should happen 16 bytes higher than the first one. */
pop %ax
mov %bx, %sp
PRINT_HEX(%al)
hlt
stack:
.word 1
/* 2 bytes from the word above + 14 = 16 */
.skip 14
/* This is at stack + 16 */
.word 2