AstatineOS: Making my own operating system step by step
While it's incredibly basic right now, AstatineOS is a fun project that I'm working on to learn more about lower-level computer operations.
12/17/2025
Hey there! I've been working on a project called AstatineOS, which is an operating system written from scratch in C. It's still in its early stages, and I actually recently renamed it from NetworkOS to AstatineOS, since I felt the name NetworkOS didn't do justice to what I actually want to achieve. You can find the video I made about NetworkOS here.
The video above shows NetworkOS, but in all honesty, the old version from last January wasn't a real operating system, or at least an operating system by today's definitions. Wikipedia defines an operating system as seen below.
An operating system (OS) is system software that manages computer hardware and software resources, and provides common services for computer programs.
By that definition, this original version is actually a computer program: it ran everything in kernel mode and provided no way to build programs for the platform. The 3 included programs, which were a game, a notes app, and a basic interpreter, were compiled alongside the kernel and had hard dependencies on kernel functions.
Unlike Apple, I don't want to set a double standard for the code I write versus the code someone else might compile for my program. Also unlike Apple, I'm not a trillion dollar company with a bunch of engineers 💀. I ended up getting burnout from working on this project in January, and didn't really touch it all that much over the summer. I worked on other stuff, like robotics and the FBLA app.
However, in October and November, I suddenly found some inspiration to work on it again. From where, you might ask? Hack Club (which always seems to be the answer, doesn't it?). With this newfound motivation, I created a list of all the features I wanted to add to turn this from "barely a program" to "barely a program RUNNER". Below is this list.
- Filesystem (somewhere to read and write files and programs)
- Compiler and compilation toolchain (to build programs)
- Drivers for basic hardware (if a program has no input or output, did it truly run? words to think about)
- Memory management and paging (so programs can be loaded anywhere)
- Program loader (so programs can be loaded and run)
Filesystem
Let's start with the filesystem. At this point in the code, AstatineOS was nothing more than a flat binary and a bootloader. If you took a 512-byte piece of code that loads 20KB of data into memory and runs it, you would have AstatineOS as it was in January.
The issues with this were obvious: if the code was larger than 20KB, it wouldn't run. If you wanted to store files, you couldn't. You want to dual-boot? Forget about it. Don't forget that any random person could overwrite your code by writing to the disk directly, so you could either boot into literal garbage, or a program that destroys your computer. Doesn't seem like the kind of code you want on there.
Now, while we could just throw a FAT32 partition into our existing image and call it a day, there's a couple of problems with that as well (nothing but problems!!!). FAT32 needs a driver to read and write files, and we only have 512 bytes to work with in our bootloader. Other bootloaders, like GRUB, take a lot more than 512 bytes to fit different filesystem drivers, but we don't have that luxury.
Or maybe we do? You see, the bootloader our BIOS loads is 512 bytes. What we load from there is up to us. What we can do is have our bootloader load a SECOND bootloader that has more space to work with. This second bootloader can then load a FAT32 driver and read files from there. This is a very simple explanation of what I ended up doing.
The technical bits
We can't necessarily guarantee that the first bootloader will be our own bootloader; it's quite possible that the user sets up GRUB for dual-booting, or that Windows goes in and replaces our bootloader with its own. For that reason, we need a boot system that doesn't rely on our own bootloader being present. The current system is the opposite of that, since we rely on our own bootloader to load the 20KB of fixed data.
We also can't directly load FAT32, because again, we have no driver for it. More importantly, though, we need a section of code that sets up the environment for our OS to run in. GRUB is good, but it doesn't know everything about our OS, like the kind of memory management we want, or the kind of segmentation we want. For that reason, we can create an intermediate boot partition with a custom filesystem whose code sets up that environment. I called this custom filesystem "Astatine Boot Partition".
Astatine Boot Partition (ABP) is very simple, then. Like any partition, we have a 512-byte VBR, or volume boot record, at the top, with some metadata about the partition. After that, we have all the raw data that we need to parse out. In ABP, the first sectors are a table of file entries on the system, aligned to the nearest sector. Each file entry contains the filename (which is fixed length), the starting address, the length in sectors, etc. After this file table, we have our file data.
Now, we can set up a partition table and set our first partition as an ABP. When a bootloader, not necessarily our own, loads the VBR, the VBR can then scan the file table for a file called "BOOT.AEX", which is our second bootloader. Once it finds it, it loads it into memory and runs it. Otherwise, we hang indefinitely. Note that to even create an ABP, we need a BOOT.AEX file.
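To make the file table lookup a bit more concrete, here's a minimal sketch in C. The real VBR has to do this in 16-bit assembly, and the exact ABP entry layout isn't shown in this post, so the `AbpEntry` struct, the 16-byte name length, and the `abp_find` function are all assumptions made purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ABP_NAME_LEN 16 /* assumed fixed filename length */

/* Hypothetical on-disk file entry; the real ABP layout may differ. */
typedef struct {
    char     name[ABP_NAME_LEN]; /* fixed-length, zero-padded filename */
    uint32_t start_sector;       /* first sector of the file data */
    uint32_t sector_count;       /* length of the file in sectors */
} AbpEntry;

/* Scan the file table for an exact filename match.
 * Returns a pointer to the entry, or NULL if it is not present. */
const AbpEntry *abp_find(const AbpEntry *table, size_t count, const char *name) {
    size_t len = strlen(name);
    if (len > ABP_NAME_LEN) return NULL;
    for (size_t i = 0; i < count; i++) {
        /* The name must match, and whatever follows it must be padding. */
        if (memcmp(table[i].name, name, len) != 0) continue;
        if (len == ABP_NAME_LEN || table[i].name[len] == '\0') return &table[i];
    }
    return NULL;
}
```

If `abp_find` returns NULL for "BOOT.AEX", the only sensible thing left to do is hang, which matches the behaviour described above.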
The BOOT.AEX file is a 16-bit program that sets up an environment for our OS to run in. This includes stuff like providing it with certain values, like the partition we booted from, the map of memory available on the system, etc. It also jumps our code from 16-bit to 32-bit mode, so our OS can run in protected mode.
The above code is available as v0.0.2-alpha on GitHub.
Coming back to the point
I realise that I was talking about implementing a FAT32 filesystem driver, but instead I talked about making some random ahh filesystem, so my bad. Coming back to the point, now that we have an intermediate bootloader, we can add a FAT32 loader to our ABP. Now, our BOOT.AEX file is, by definition, an assembly program, and writing a FAT32 driver in assembly would be worse than trying to lift a car with a surgical scalpel. For that reason, our BOOT.AEX file actually loads a LOADER.AEX file on the ABP, which is compiled from C code. This LOADER.AEX file contains an embedded FAT32 driver from this really helpful repo online from strawberryhacker.
This is a minimal FAT32 library written in C. No external dependencies. A little over 1K lines of code. Binary size is around 8KiB (ARM) when all functionality is used. It is not threadsafe. - strawberryhacker/fat32
With this FAT32 driver, we can read our OS from a FAT32 partition. Similar to how the Linux kernel works, we can store our kernel as a single binary image that is loaded into memory and run. This is how we now have filesystem support with our operating system. Or do we?
What the hell kinda disk???
We have a FAT32 driver, but we don't actually have any way to read off of our disk??? You see, before, we could rely on int 13h BIOS calls to read data from the disk. But since our FAT32 driver runs in protected mode, we can't rely on BIOS calls anymore. This means that we need to add a disk driver to our LOADER.AEX. One of the easier disk drivers to write is the IDE, or "ATA PIO", driver, which only requires port I/O. The downside is that we can now only boot from an IDE disk or a SATA disk with IDE emulation. Most real hardware is unfortunately out of the picture.
Forgetting the final step...
While our OS now boots from a FAT32 partition, our OS doesn't actually use it at all. This is extremely easy to fix, however. All we now need to do is include the exact same FAT32 driver into our OS code, and use it to read a couple of sample files from the FAT32 partition. This way, our OS can now read and write files from the filesystem.
And that's filesystem support! You can find this release as v0.0.3-alpha on GitHub.
Compiler and toolchain
Now, if we want to make progress on actually running programs on our OS, we need some way for programs to be built for it. Now, while we could just build kernel-mode programs that have full access to the hardware, that would be a security nightmare, since some idiot could just do this to the computer.
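movb $0xfe, %al     # 0xFE is the "pulse the reset line" command
outb %al, $0x64     # send it to the keyboard controller's command port
hlt
which basically tells the keyboard controller to pulse the CPU's reset line, immediately rebooting the computer. Not very user-friendly, is it?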
For that reason, we need to have user-mode programs that have limited access to the hardware. These programs can only communicate with the hardware through system calls, or special interrupts, to the kernel, which then processes the request and returns the result. This way, we can have some level of security.
Based on this short description, it's clear we need a lot more than just a compiler to run programs, but that's future me's problem!
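To give a rough idea of what "communicating through system calls" looks like on the kernel side, here's a minimal sketch of a syscall dispatcher. The syscall numbers, the three-argument convention, and every function name here are assumptions for illustration, not AstatineOS's actual interface. The idea is that the interrupt handler pulls a syscall number and arguments out of the user's registers, and the kernel refuses anything it doesn't recognise.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical syscall numbers; the real kernel may number these differently. */
enum { SYS_WRITE = 0, SYS_GETPID = 1, SYS_COUNT };

typedef int32_t (*syscall_fn)(uint32_t a, uint32_t b, uint32_t c);

/* Placeholder handlers standing in for real kernel work. */
static int32_t sys_write(uint32_t fd, uint32_t buf, uint32_t len) {
    (void)fd; (void)buf;
    return (int32_t)len; /* pretend every byte was written */
}

static int32_t sys_getpid(uint32_t a, uint32_t b, uint32_t c) {
    (void)a; (void)b; (void)c;
    return 1;
}

static const syscall_fn syscall_table[SYS_COUNT] = {
    [SYS_WRITE]  = sys_write,
    [SYS_GETPID] = sys_getpid,
};

/* Called by the interrupt handler with values taken from user registers.
 * Unknown syscall numbers are rejected instead of being trusted. */
int32_t syscall_dispatch(uint32_t number, uint32_t a, uint32_t b, uint32_t c) {
    if (number >= SYS_COUNT || syscall_table[number] == NULL) return -1;
    return syscall_table[number](a, b, c);
}
```

The important part is the bounds check: the user program only ever gets to pick an index into a table the kernel controls, never an arbitrary address to jump to.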
GCC: the goat
Continuing from the last section, we need a C compiler to turn our C programs into executables. The best one to use is GCC. Now, I'm going to be real with you: I just used the OSDev wiki's guide on setting up cross-compilation with GCC. I was already using a cross-compiler to compile my code to x86 (since I'm on a Mac), so this time I had to set up a custom toolchain that would compile user-space code for AstatineOS.
Before that though, we need to define the libc for our OS. The libc is the standard C library that provides basic functions that communicate with our syscalls. While you could theoretically spend a couple months working on your own custom libc from scratch, and eventually get there, it's way easier to just use an existing one. I ended up going with newlib: a C library implementation for embedded systems. It's much simpler than other libcs, as I can just implement the syscalls I have implemented so far, and leave the rest unimplemented (spoiler alert: I did).
With a libc, we can now create our target for GCC as i686-astatine-gcc and i686-astatine-bintools. With that, we can create C programs that run purely in user-space.
Drivers for basic hardware
Now, we have the ability to compile programs. But what can these programs do? Basically this:
int main() {
    while(1);
}
Not very useful, is it? For that reason, we need to add drivers for basic hardware, like the keyboard and the screen. This way, programs can at least interact with the user.
Now, at first glance, this seems pretty simple. We already have an interface with which we can read and write to things like the screen and keyboard. However, doing this in a way that isn't completely awful is a different story. Good things in life take time, and writing a good driver subsystem is just like that.
Now, to test if we can even write drivers, we can create a very simple syscall for writing to the screen. This syscall simply takes the string passed through it and writes it to the screen using our existing VGA text mode driver.
#include <stdio.h>

int main() {
    printf("Hello, World!\n");
    return 0;
}
This simple program now works! We can write to the screen from user-space programs.
Memory and paging???
You'll notice I didn't link a GitHub release for the last 2 sections. This is because the code above? All it does is compile. The issue with this code is that if we compile a program using GCC like this:
i686-astatine-gcc -o hello.aex hello.c
it'll create an ELF file that has certain expectations about where in memory it will be loaded. By default, GCC links programs to be loaded at 0x08048000, which is where Linux loads programs. However, the virtual machine I'm using to test AstatineOS does NOT have enough memory to load programs at that address linearly. Neither do most real computers.
This is where virtual memory and paging come into play. Paging is the idea that memory is divided into fixed-size pages (4KB here), and that each virtual page can be mapped to any physical page. This has insane uses in the real world. Let's say you have a program that needs to load 100MB of data, but you don't have 100MB of consecutive physical memory. With paging, you can map non-consecutive physical pages to consecutive virtual pages like this:
Virtual Address 0x00400000 -> Physical Address 0x00123000
Virtual Address 0x00401000 -> Physical Address 0x00234000
Virtual Address 0x00402000 -> Physical Address 0x00056000
This way, the program has all the memory it needs, while the OS can manage physical memory more efficiently. Also,
when it comes to switching programs, or if the OS runs out of memory, it can just write pages of memory to disk and
free up physical memory for other programs. This is the basis of every modern operating system since at least the
90s, if not before.
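For the curious, here's a small sketch of how a 32-bit virtual address gets carved up under standard x86 two-level paging with 4KB pages. The bit layout is the standard one; the `VirtParts` struct and function names are just illustrative and not taken from the AstatineOS source.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

typedef struct {
    uint32_t dir_index;   /* bits 31..22: which page directory entry */
    uint32_t table_index; /* bits 21..12: which entry in that page table */
    uint32_t offset;      /* bits 11..0:  byte offset inside the 4KB page */
} VirtParts;

/* Split a virtual address into the three fields the MMU uses. */
VirtParts split_virtual(uint32_t vaddr) {
    VirtParts p;
    p.dir_index   = vaddr >> 22;
    p.table_index = (vaddr >> 12) & 0x3FFu;
    p.offset      = vaddr & 0xFFFu;
    return p;
}

/* Combine the physical page base found in a page table entry with the
 * offset from the virtual address to get the final physical address. */
uint32_t to_physical(uint32_t page_base, uint32_t vaddr) {
    return (page_base & ~(PAGE_SIZE - 1u)) | (vaddr & (PAGE_SIZE - 1u));
}
```

Using the mapping from the example above, the virtual address 0x00401234 lives in directory entry 1, table entry 1, at offset 0x234, so with that page mapped to physical 0x00234000 it ends up at physical 0x00234234.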
The code for this is simple. At 0x1000, we set up our page directory, which contains 1024 entries that together cover the full 4 GB virtual address space available in 32-bit mode. Each entry in the page directory points to a page table, which contains 1024 entries that each control one 4 KB page, so a single page table covers 4 MB. We can configure the first 4 megabytes of memory to be kernel-only, and leave the rest unmapped.
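Here's a minimal sketch of that setup, written so it can be tried out on plain arrays rather than real page tables. The flag bits are the standard x86 ones; the function name and the `first_table_phys` parameter are assumptions for illustration, and a real kernel would still need to load the directory's address into CR3 and enable paging afterwards.

```c
#include <stdint.h>

#define PAGE_PRESENT  0x1u /* entry is valid */
#define PAGE_WRITABLE 0x2u /* writes are allowed */
#define PAGE_USER     0x4u /* user mode may access; left clear = kernel-only */
#define ENTRIES       1024u
#define PAGE_SIZE     4096u

/* Identity-map the first 4 MB as kernel-only and leave everything else
 * unmapped. first_table_phys is the physical address of first_table. */
void map_first_4mb(uint32_t *page_directory, uint32_t *first_table,
                   uint32_t first_table_phys) {
    /* A zero entry has the present bit clear, so any access faults. */
    for (uint32_t i = 0; i < ENTRIES; i++) {
        page_directory[i] = 0;
    }
    /* Page i maps to physical address i * 4KB, without PAGE_USER. */
    for (uint32_t i = 0; i < ENTRIES; i++) {
        first_table[i] = (i * PAGE_SIZE) | PAGE_PRESENT | PAGE_WRITABLE;
    }
    /* Directory entry 0 covers virtual addresses 0x00000000..0x003FFFFF. */
    page_directory[0] = first_table_phys | PAGE_PRESENT | PAGE_WRITABLE;
}
```

Because `PAGE_USER` is never set, a user-mode access to anything in that first 4 MB raises a page fault instead of quietly succeeding.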
With this method, any time we write to a section of memory that isn't mapped, or that the user shouldn't access, we can get a page fault, helping debug issues in our OS much faster (and trust me there were a ton).
Program loader
Now, the final step to get programs working is to create a program loader that can read ELF files, allocate the sections of memory they need to be in, load their data, and run them. This is actually pretty annoying, because C has no runtime protection, so it's really easy to make a mistake reading an ELF file that causes the entire reader to fail and not load memory properly.
This ended up happening later, but I'll explain that later in this post. For now, all we have to do is use the OSDev guide for ELF Loading and load our program segments into memory properly. Once we do that, we can sit back and enjoy this beautiful sight:
Hello, World!
And there you have it! AstatineOS can now run user-space programs. You can find this release as v0.0.4-alpha on GitHub.
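Since C won't stop a loader from reading past the end of a file, here's a minimal sketch of the kind of bounds checking an ELF32 reader can do before touching any segment. The struct layouts follow the ELF specification, but the type names and `elf32_count_loadable` are my own for illustration and this is not the AstatineOS loader. It also assumes a little-endian host, which is what i686 is.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF32 file header, field order as in the ELF specification. */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} Elf32Header;

/* ELF32 program header: describes one segment to load. */
typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32ProgramHeader;

#define PT_LOAD 1u

/* Returns the number of loadable segments, or -1 if the image is malformed. */
int elf32_count_loadable(const uint8_t *image, size_t size) {
    Elf32Header eh;
    if (size < sizeof eh) return -1;
    memcpy(&eh, image, sizeof eh);

    /* Magic bytes, then the 32-bit class marker (ELFCLASS32 == 1). */
    if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0) return -1;
    if (eh.e_ident[4] != 1) return -1;
    if (eh.e_phentsize != sizeof(Elf32ProgramHeader)) return -1;

    /* The whole program header table has to sit inside the file. */
    if (eh.e_phoff > size) return -1;
    if ((size - eh.e_phoff) / sizeof(Elf32ProgramHeader) < eh.e_phnum) return -1;

    int loadable = 0;
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        Elf32ProgramHeader ph;
        memcpy(&ph, image + eh.e_phoff + (size_t)i * sizeof ph, sizeof ph);
        if (ph.p_type != PT_LOAD) continue;
        /* File-backed bytes must be inside the image and fit in memory. */
        if (ph.p_offset > size || ph.p_filesz > size - ph.p_offset) return -1;
        if (ph.p_filesz > ph.p_memsz) return -1;
        loadable++;
    }
    return loadable;
}
```

Copying headers out with `memcpy` instead of casting pointers avoids alignment surprises, and every offset is checked against the file size before it's used.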
You didn't think we were done, did you now?
Let's start with v0.0.4-alpha.1. I wanted to include the ability to read from the keyboard, so I created a syscall for that as well.
Explaining the syscalls better
The way syscalls worked after the original test was a lot better than before. We still use interrupts, but like Linux, we now have syscalls that take in file descriptors. These file descriptors are tracked by the kernel, which then maps them to actual hardware devices. For this basic example, FD 0, 1, and 2 were mapped to standard input, output, and error, respectively.
With this in mind, we can create 3 file descriptors when a program is launched using a simple function:
void terminal_install() {
    terminal_fops.read = terminal_read;
    terminal_fops.write = terminal_write;

    struct fd* stdin = &open_fds[0];
    stdin->exists = true;
    stdin->position = 0;
    stdin->internal = null;
    stdin->fops = &terminal_fops;

    struct fd* stdout = &open_fds[1];
    stdout->exists = true;
    stdout->position = 0;
    stdout->internal = null;
    stdout->fops = &terminal_fops;

    struct fd* stderr = &open_fds[2];
    stderr->exists = true;
    stderr->position = 0;
    stderr->internal = null;
    stderr->fops = &terminal_fops;
}
This way, when a program calls read or write on these file descriptors, the kernel can route them to the
terminal driver, which then reads from the keyboard buffer or writes to the screen.
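As a rough sketch of that routing step, here's what a write syscall might look like once it has a file descriptor number in hand. The `struct fd` fields mirror the function above, but the `fops` function signatures, the table size, and `demo_write` (a stand-in for the real terminal write) are all assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FDS 16 /* assumed size of the descriptor table */

struct fd;

/* Assumed signatures; the post only shows that read and write exist. */
struct fops {
    int (*read)(struct fd *self, void *buf, size_t len);
    int (*write)(struct fd *self, const void *buf, size_t len);
};

struct fd {
    bool exists;
    uint32_t position;
    void *internal;
    struct fops *fops;
};

struct fd open_fds[MAX_FDS];

/* Stand-in for the terminal's write function: reports all bytes written. */
int demo_write(struct fd *self, const void *buf, size_t len) {
    (void)self; (void)buf;
    return (int)len;
}

struct fops demo_fops = { .read = NULL, .write = demo_write };

/* Look up the descriptor and hand the request to whatever backs it.
 * Returns -1 for descriptors that are out of range or not open. */
int sys_write(int fd, const void *buf, size_t len) {
    if (fd < 0 || fd >= MAX_FDS) return -1;
    struct fd *f = &open_fds[fd];
    if (!f->exists || f->fops == NULL || f->fops->write == NULL) return -1;
    return f->fops->write(f, buf, len);
}
```

The nice thing about this shape is that the syscall itself never needs to know it's talking to a terminal; swapping in a file or a pipe later only means pointing `fops` somewhere else.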
Fleshed out driver support
Right now, you see that we have a terminal driver that handles keyboard and screen I/O. However, this isn't an actual driver subsystem. We can define drivers as modules that do the following:
A device driver is software that operates or controls a particular type of device that is attached to a computer. A driver provides a software interface to hardware devices, enabling other software to access hardware functions without needing to know precise details about the hardware. - Wikipedia
We... don't have that. The terminal driver is not a driver, but rather, a set of functions that determine how to run I/O. We need some way to identify devices, load their drivers, and have a way for programs to interact with the active device driver.
Since we need a way to enumerate devices, we also need a way to keep devices in memory. We can represent a device with this struct:
typedef struct Device {
    // These are kernel-level identifiers
    // these don't mean anything for identifying
    // the actual device, but rather to keep track of
    // a device throughout functions.
    char* name;
    int type;
    int conn;
    u32 id;
    u32 size;
    bool owned;
} Device;
So we know the type of device, how it's connected, and its internal ID. Size just helps for knowing how big the struct actually is, since we can use that to add more fields depending on whether the Device is a PCI device (which would have more metadata, look at the repo to see the abstraction).
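To illustrate how the size field makes that kind of extension safe, here's a minimal sketch. The `Device` struct is the one from above; `PciDevice`, its two fields, `CONNECTION_TYPE_PCI` and its value, and `as_pci` are hypothetical and may not match the abstraction in the repo.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;
typedef uint16_t u16;

#define CONNECTION_TYPE_PCI 2 /* hypothetical value */

typedef struct Device {
    char *name;
    int type;
    int conn;
    u32 id;
    u32 size;
    bool owned;
} Device;

/* Hypothetical extension: PCI-specific metadata after the common fields. */
typedef struct PciDevice {
    Device base; /* must come first so a PciDevice* is also a Device* */
    u16 vendor_id;
    u16 device_id;
} PciDevice;

/* Only view a Device as a PciDevice when its connection type says so and
 * its recorded size proves the extra fields are really there. */
PciDevice *as_pci(Device *dev) {
    if (dev == NULL) return NULL;
    if (dev->conn != CONNECTION_TYPE_PCI) return NULL;
    if (dev->size < sizeof(PciDevice)) return NULL;
    return (PciDevice *)dev;
}
```

Generic code can keep passing plain `Device*` pointers around, and only the code that cares about PCI details ever needs to ask for the larger view.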
We can then define a driver struct very similarly: drivers need to have certain functions implemented no matter their type, so we can define this abstracted struct below:
// This is an active instance of a driver.
typedef struct AstatineDriver {
    u32 driver_type;
    u32 device_type;

    // if not null, then driver is initialised
    Device* device;
    struct KernelFunctionPointers* kfp;

    // Some drivers will want to discover their own devices
    // like ISA drivers that don't use plug-and-play.
    // or other legacy hardware like the pcspk.
    bool (*probe)(Device* device, struct KernelFunctionPointers* kfp);

    // Other drivers will rather just get the list of devices
    // and look for ones that they can manage.
    bool (*check)(Device* device, struct KernelFunctionPointers* kfp);

    // Assuming a device that this can manage has been found,
    // it means we can add that device to the global list,
    // and initialise the driver for that device.
    int (*init) (struct AstatineDriver* self);

    // If the driver is deemed unnecessary, the deinit function
    // will release the device.
    void (*deinit)(struct AstatineDriver* self);

    // Other functions are determined by the driver handler
    // so teletype-drivers will have the ability to draw characters,
    // display drivers can set pixels, etc.
} AstatineDriver;
The one issue with the current driver system is that there are two identical driver structures: one for
currently-active drivers that have been instantiated and attached to a device, and one for driver blueprints
that can be loaded from disk. This is something that should be optimised in the future.
Now, we have both a driver struct and a device struct. By "discovering" ISA devices like the VGA text mode that is always available in 32-bit BIOS systems, we can instantiate a teletype driver for this VGA text mode device. This is very similar to how Linux handles drivers, where both drivers and devices are registered, and if a matching pair is found, instantiation is attempted.
ASTATINE_DRIVER(TeletypeDriverFile) = {
    .base = {
        .sig = "ASTATINE",
        // teletype item
        .driver_type = CONNECTION_TYPE_IO,
        // i have to be another level of stupid
        .device_type = DEVICE_TYPE_TTYPE,
        .name = "VGA BIOS Text-mode Driver",
        .version = "0.1",
        .author = "Adithiya Venkatakrishnan",
        .description = "BIOS VGA text-mode driver (80x25).",
        .reserved = {0},
        .verification = {0xEF, 0xBE, 0xAD, 0xDE, 0xEF, 0xBE, 0xAD, 0xDE},
        .probe = null,
        .check = check,
        .init = init,
        .deinit = deinit,
    },
    .functions = {
        .get_mode = get_mode,
        .set_char = set_char,
        .get_char = get_char,
        .clear_screen = clear_screen,
        .set_cursor_position = set_cursor_position,
    }
};
(If you're wondering what the "stupid" comment is about, I mixed up the device type and driver type MULTIPLE
times while writing this code...)
Since our teletype driver assumes the existence of a VGA text mode device, we don't need to implement the probe function. If the check function, which checks to see if the ISA device is the correct type, returns true, our init function is called, which then initialises the driver for that device.
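The check-then-init flow can be sketched roughly like this. The structs are trimmed-down versions of the ones above (no `kfp`, no `probe`), the enum values are made up, `demo_check` and `demo_init` stand in for a real driver's functions, and treating an `init` return value of 0 as success is an assumption on my part.

```c
#include <stdbool.h>
#include <stddef.h>

enum { DEVICE_TYPE_DISK = 1, DEVICE_TYPE_TTYPE = 2 }; /* values assumed */

typedef struct Device {
    int type;
    bool owned;
} Device; /* trimmed for the example */

typedef struct Driver {
    int device_type;
    Device *device; /* non-NULL once the driver is initialised */
    bool (*check)(Device *device);
    int (*init)(struct Driver *self);
} Driver;

/* Stand-ins for a teletype driver's check and init functions. */
bool demo_check(Device *device) { return device->type == DEVICE_TYPE_TTYPE; }
int demo_init(Driver *self) { return self->device != NULL ? 0 : -1; }

/* Try to bind the driver to the first free device it accepts.
 * Returns true once a device has been claimed and initialised. */
bool driver_attach(Driver *drv, Device *devices, size_t count) {
    for (size_t i = 0; i < count; i++) {
        Device *dev = &devices[i];
        if (dev->owned || dev->type != drv->device_type) continue;
        if (drv->check != NULL && !drv->check(dev)) continue;
        drv->device = dev;
        if (drv->init(drv) != 0) { /* init failed: release and keep looking */
            drv->device = NULL;
            continue;
        }
        dev->owned = true;
        return true;
    }
    return false;
}
```

Marking the device as owned only after `init` succeeds means a broken driver can't permanently hog a device that some other driver could have handled.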
The driver file is loaded into memory from disk, but because the functions for each driver rely on passing the driver struct around, these functions have no internal state, so we can load a single instance of the driver file for a device, and load as many driver instances as we need for each device. This will be more useful when USB drivers are implemented.
Once we have an active teletype driver installed, we can set our kernel-mode print functions to use the device specified by the driver. We can also set our terminal syscalls to use the teletype driver as well, allowing user-space programs to write to the screen and read from the keyboard.
What was the issue with the ELF loader???
I mentioned an issue with the ELF loader earlier, so let's get into that. The issue was intermittent: occasionally, drivers would not load properly and would page fault the system. This was super annoying to debug, because the issue was in the kernel and not a debuggable user-space program. I had to check where the GP fault or page fault happened, look at the kernel map to see which function caused it (spoiler: it was always the same function), and then try to fix it.
The issue was that the variables declared on the stack were not zeroed out before use. That's all I have to say.
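For anyone who hasn't been bitten by this class of bug yet, here's a tiny sketch of the fix. The `Segment` struct and `make_segment` are hypothetical and not the actual loader code. In C, a local variable without an initialiser holds whatever happened to be on the stack, which is exactly why the failure only showed up sometimes.

```c
#include <stdint.h>

/* Hypothetical bookkeeping struct a loader might fill in per segment. */
typedef struct {
    uint32_t vaddr;
    uint32_t size;
    uint32_t flags;
} Segment;

Segment make_segment(uint32_t vaddr, uint32_t size) {
    /* "= {0}" zeroes every field. Without it, flags would contain
     * leftover stack contents unless it was assigned explicitly. */
    Segment seg = {0};
    seg.vaddr = vaddr;
    seg.size = size;
    return seg;
}
```

Here `flags` is never assigned, but it's still guaranteed to be 0 because of the initialiser.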
Our teletype driver now works perfectly. We can read the teletype driver off the disk, load it into memory, instantiate it using the ELF loader, create a new instance of a TeletypeDriver that attaches to VGA text mode, then set our terminal syscalls to use that driver. This is exactly what I wanted, and now user-space programs can work better.
The final touches
I haven't quite gotten to v0.0.5-alpha.2 yet, but I did also write the infrastructure for a disk driver. Of course, rather than loading the disk driver from disk (catch-22), I hardcoded it into the kernel. This is stupid, but can be fixed later, as we can most likely include a basic disk driver in our ABP, and depending on the disk we have, we can load it from there.
Also, if you look at the repo, you'll see a "generic" driver that only has the basic AstatineDriver functions and nothing else. This is the case for most controller devices, as items like the IDE disk controller appear via PCI but don't have any specific functions that need to be implemented for special use. Of course, they can have configuration, but this configuration can largely be done through standard PCI config space, and what is most important is that the controller driver probes and enumerates the IDE disks in the system.
Burn-out v2
Now, after all this, my brain hurts from staring and trying to debug C code. I have no interest in coding for the time being, considering the current status of my gradebook. I'll probably come back to this project during my 2026 summer, but for now, this is the last of what I've got.
Linus Torvalds really was right, I had no idea what I was getting myself into. At least, at the very end of the day, I will have created something that resembles something of my own, which is always the fun part of coding, besides the actual journey.
