Moved site to hugo

This commit is contained in:
Drew Galbraith 2023-12-06 16:38:11 -08:00
parent 1339d09535
commit cd8be31924
49 changed files with 243 additions and 615 deletions

@ -0,0 +1,249 @@
---
title: "Writing a Sudoku Solver: Displaying the Grid"
date: 2023-04-10
draft: true
---
I previously dabbled with writing a [sudoku solver](https://gitlab.com/dgalbraith33/sudoku-solver)
but got carried away early on by making crazy speed improvements rather than actually improving the
solving ability. What I did enjoy about the project was making logical deductions rather than using the
guess-and-backtrack method commonly employed.
I want to take another crack at this, except this time I'll focus on the question "Given the current
state, what are the next possible deductions?" without focusing on speed. Eventually I'd also
like to take a crack at solving other types of sudokus (Chess Sudoku, Killer Sudoku, etc).
One of the issues I had last time as I was debugging was displaying the current board state in
a clean way so I could see what had gone wrong. So this time around I'm planning on writing a quick
HTML/javascript board state display to be able to visualize the board. For now I don't intend it to
be interactive, however I'll likely add that in the future.
## Creating a sudoku grid
I'm no front end dev so this may take some time. I literally just want to create an outline for the
puzzle with nothing in it.
```html
<div class="container">
  <h1>Sudoku</h1>
  <div class="puzzle">nothing</div>
</div>
```
```css
.container {
  max-width: 800px;
  margin: auto;
}
.puzzle {
  border: 1px solid black;
}
```
{{< figure src="images/sudoku-1.png" alt="A box with nothing in it."
class="center-img" >}}
I'm not joking when I say I'm going to have to do baby steps here.
Next I'll try to actually make a square grid. The best way to do this is probably with flexbox or
something but I'm just gonna hard code some widths and heights.
Ok, let's make this thing a width divisible by 9 so we can divide it into equal portions. I chose 630
quite honestly because it was the first number below 800 divisible by 9 that popped into my head.
Let's focus on the 9 main boxes in the grid now before worrying about the cells. I'll make each box
210 pixels tall and wide. And slap a big ole border on there. Let's make it 2px because we will
want a narrower one for the cells.
```css
.puzzle {
  width: 630px;
  height: 630px;
}
.box {
  border: 2px solid black;
  width: 210px;
  height: 210px;
}
```
{{< figure src="images/sudoku-2.png" alt="Boxes stacked vertically."
class="center-img" >}}
Reload that and... right, divs automatically break onto a new line.
I think we can `float: left` these bad boys and...
{{< figure src="images/sudoku-3.png" alt="Boxes stacked vertically in pairs."
class="center-img" >}}
Right, now I'm pretty sure the boxes are 210 + 4 pixels wide because the border
isn't included. While I'm tempted to just math my way out of this I recall that
you can specify the border-box sizing to avoid this.
Now this works! The astute among you may have noticed that there were 10, not 9, boxes in the
screenshot with the 2 columns. That became even more obvious in the full grid.
{{< figure src="images/sudoku-4.png" alt="A 3x3 grid of boxes with one extra below it."
class="center-img" >}}
Ok now we can just recreate all of this with the cells and should be good to go right?
Nope! The internal size of each box is now only 206x206 because of the
border-box sizing. But I now realize I can just get rid of the puzzle sizing
altogether and go back to regular sizing on the boxes. This happens to work
because four 214px boxes won't fit in the 800px wide container. (Again, just use
flexbox).
Finally this works!
{{< figure src="images/sudoku-5.png" alt="A 3x3 grid of boxes." class="center-img" >}}
## Displaying a puzzle
Next up is to actually get some numbers in this bad boy. Now this is where I relent and use
flexboxes because
[this](https://stackoverflow.com/questions/2939914/how-do-i-vertically-align-text-in-a-div/13515693)
StackOverflow answer convinced me.
```css
.cell {
  ...
  font-size: 40px;
  font-weight: bold;
  display: flex;
  align-items: center;
  justify-content: center;
}
```
{{< figure src="images/sudoku-6.png" alt="A box with 9 cells in it." class="center-img" >}}
### Taking the puzzle from the URL
Now to make it so I can get the page to display any puzzle I want easily from the solver, I'll allow
specifying it as a parameter in the URL. For now in a row-major string of 81 characters using a
period to denote blank spaces.
I can just get all of the cells by class name and iterate over them in the same order as the
puzzle string, and it should display relatively easily.
```js
window.onload = (event) => {
  // Read the puzzle from the "p" query parameter.
  const params = new URLSearchParams(window.location.search);
  const puzzle = params.get("p");
  if (puzzle === null) {
    return;
  }
  if (puzzle.length !== 81) {
    console.log("Failure: puzzle url len is " + puzzle.length);
    return;
  }
  const cells = document.getElementsByClassName("cell");
  if (cells.length !== 81) {
    console.log("Failure: wrong number of cells: " + cells.length);
    return;
  }
  // Fill the cells in document order, skipping blanks.
  for (let i = 0; i < 81; i++) {
    if (puzzle[i] !== '.') {
      cells[i].innerText = puzzle[i];
    }
  }
};
```
And hooray it works super well!
{{< figure src="images/sudoku-7.png" alt="A full sudoku puzzle!" class="center-img" >}}
Oh wait... I didn't think about the fact that the elements in the HTMLCollection from the
document.getElementsByClassName call wouldn't be in the row order of the puzzle (all of the cells
in box 1 come first).
You can see the effect of this as there are 2 9s in column 2 of the puzzle. Oops.
I'm going to just do the old fashioned brute force way and give each cell an id from 1-81 and insert
those manually. I'm sure there is a better way but hey this works.
Then with a quick update to the code.
```js
// Cell ids run from 1 to 81 in row-major order.
for (let i = 0; i < 81; i++) {
  if (puzzle[i] !== '.') {
    document.getElementById(String(i + 1)).innerText = puzzle[i];
  }
}
```
It works!
{{< figure src="images/sudoku-8.png" alt="A full sudoku puzzle for real this time!" class="center-img" >}}
Still some goofiness like the border on the outside being thinner than the interiors but I'm pretty
happy with this for now.
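For what it's worth, the hand-numbered ids could probably be avoided by converting between the two orderings with a bit of arithmetic. The sketch below assumes the 81 cells appear in the DOM one box at a time (boxes left to right, then top to bottom) and that the nine cells inside each box are themselves in row-major order. The names `domIndexToPuzzleIndex` and `fillCells` are made up for illustration.

```js
// Convert a cell's position in DOM (box-by-box) order into its index in the
// row-major puzzle string. Assumes boxes and the cells within each box are
// both laid out left to right, top to bottom.
function domIndexToPuzzleIndex(domIndex) {
  const box = Math.floor(domIndex / 9); // which of the 9 boxes
  const within = domIndex % 9;          // position inside that box
  const row = Math.floor(box / 3) * 3 + Math.floor(within / 3);
  const col = (box % 3) * 3 + (within % 3);
  return row * 9 + col;
}

// Hypothetical helper: fill an array-like list of 81 cells (in DOM order)
// from an 81-character row-major puzzle string.
function fillCells(cells, puzzle) {
  for (let k = 0; k < 81; k++) {
    const ch = puzzle[domIndexToPuzzleIndex(k)];
    if (ch !== ".") {
      cells[k].innerText = ch;
    }
  }
}
```

In the page this would be called as `fillCells(document.getElementsByClassName("cell"), puzzle)`, with no ids needed.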
## Showing pencil marks
Now to really visualize the solver's state, we'll also need to see which pencil marks it has.
These could be styled nicely but for now I just went with a span inside the cell div with the
following style.
```css
.cell > span {
  font-size: 10px;
  font-weight: normal;
  text-align: center;
  letter-spacing: 6px;
  word-wrap: anywhere;
}
```
The `letter-spacing` and `word-wrap` properties let us just jam all of the pencil marks in as a single
string and let the browser break them up into multiple lines for us.
This comes out quite nicely:
{{< figure src="images/sudoku-9.png" alt="The top boxes of a sudoku puzzle with pencil marks"
class="center-img" >}}
### Reading the pencil marks from the url
For this URL trick we will add the pencil marks as a comma-separated array like so:
123,,,,456,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Using the string above, we can parse and insert the marks with the following JavaScript:
```js
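// Continues inside the same onload handler as the puzzle code.
const marks_param = params.get("m");
if (marks_param === null) {
  return;
}
const marks = marks_param.split(",");
if (marks.length !== 81) {
  console.log("Failure: marks url len is " + marks.length);
  return;
}
for (let i = 0; i < 81; i++) {
  if (marks[i].length > 0) {
    const cell = document.getElementById(String(i + 1));
    if (cell.innerHTML.trim().length > 0) {
      // A cell that already shows a digit should not also carry pencil marks.
      console.log("Pencil marks in cell with number: " + (i + 1));
    } else {
      // Build the span with textContent so URL input is never parsed as HTML.
      const span = document.createElement("span");
      span.textContent = marks[i];
      cell.replaceChildren(span);
    }
  }
}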
```
Which also comes out nicely:
{{< figure src="images/sudoku-10.png" alt="The top boxes of a sudoku puzzle with pencil marks"
class="center-img" >}}
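On the solver side, something has to produce these two parameters. Here is a minimal sketch of that, assuming the solver can hand over an array of 81 givens (with 0 meaning blank) and an array of 81 candidate lists. Both of those layouts and the name `boardToQuery` are assumptions for illustration, not something the solver has today.

```js
// Hypothetical solver-side helper: build the "p" and "m" query parameters.
// `givens` is an array of 81 digits (0 = blank); `marks` is an array of 81
// arrays of candidate digits (empty for cells without pencil marks).
function boardToQuery(givens, marks) {
  const p = givens.map((d) => (d === 0 ? "." : String(d))).join("");
  const m = marks.map((candidates) => candidates.join("")).join(",");
  // URLSearchParams takes care of percent-encoding the commas.
  return new URLSearchParams({ p, m }).toString();
}

const givens = new Array(81).fill(0);
givens[1] = 5;
const marks = Array.from({ length: 81 }, () => []);
marks[0] = [1, 2, 3];
marks[4] = [4, 5, 6];
const query = boardToQuery(givens, marks);
console.log("?" + query);
```

The page's `params.get("m")` decodes the commas again, so the parsing code above doesn't need to change.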

0
content/blog/_index.md Normal file
@ -0,0 +1,311 @@
+++
title = 'Acadia 0.1.0'
date = 2023-12-06
draft = true
tags = ['osdev']
+++
For the last six months or so I've been periodically working on developing a
hobby operating system. A couple weeks ago I decided that I should finally aim
to cut a "release." This very-early release doesn't include a bunch of user
functionality but does have a fair amount of kernel features.
Namely you can navigate a filesystem in a primitive manner and
execute binaries. The following image shows just about everything the OS can do.
(The black window is the OS running in QEMU and the larger gray window is debug
output sent to COM1).
![AcadiaOS in action](/images/blog/acadiaos-0.1.0.png)
While there isn't much to do as a user, there are a lot of building blocks there
that I spent the last 6 months learning about and working on.
## What I knew going into this
Frankly, not a lot.
I took an OS class in college, but while it covered OS fundamentals the projects
were based on writing modules for the Linux kernel rather than working on our
own barebones kernel and OS. So while I vaguely knew of how things like process
scheduling, interrupts, and memory management worked, I had no experience
getting down to the brass tacks of how to actually implement these things.
I had over the previous couple years spent some time writing a small kernel to
start learning some of these things. However, since I used it as a testing
ground for learning with no real design goals or long term plan, it was kind of
a mess. I had gotten to user space with some primitive syscalls but it was
memory issues and page faults galore. So I decided to "reboot" things earlier
this year.
## Design Goals
I decided I wanted to write a microkernel based OS because I figured the more of
my messy code I can move to user space the better. And also because that's what
OS nerds do. I'm not too concerned about the performance cost of extra syscalls
because by god this thing isn't gonna be too performant anyways.
Additionally, I wanted to try to make the system capability-based. Trying a new
permission model was appealing to me because I've always felt the unix style one
was a bit clunky. After spending some time reading about seL4 and digging into
the Zircon interface I had a (very) rough idea of how these systems worked. I
have no illusions that my OS will ever be "secure" but I find the model
interesting.
## References and Resources
Over the course of this project I used a lot of resources, not least of which
the OSDev.org [wiki](https://wiki.osdev.org) and
[forums](https://forum.osdev.org). The resources provided there were invaluable,
but the biggest lesson I learned since my first time around writing a kernel was
to rely on specs more than others' code samples and tutorials.
For the low-level stuff I spent a lot of time digging through Intel and AMD's
monstrous programming manuals. It was helpful to use the wiki to learn, for
instance, that using the "iret" instruction is a good way to jump to user-space
for the first time, but from there I used the programming manuals to understand
exactly how that instruction works rather than just copying code from somewhere.
I had a similar experience with initializing the GDT in 64 bit software. There
are a lot of random claims out there on exactly how you have to set it up, so it
was much more efficient to just go dig through the AMD64 spec however dry it may
be.
As I worked my way up the stack, I used the SATA and AHCI specs as well. They
pose the additional complication of splitting things up across multiple specs so
you have to go back and forth a lot in non-obvious ways. Hey at least they don't
try to charge you thousands of dollars to get the spec like PCI.
I also found that when you need examples of how to do something specific, it
can be far better to look at an existing operating system's approach to help
contextualize a specification. Andreas Kling's SerenityOS was invaluable for
this for some low-level x86 things. I also referenced the Zircon microkernel to
figure out how to use C++ templates to downcast capability pointers to their
specific object types without relying on RTTI (run-time type information).
## Kernel Implementation Details
Ok enough about high level information, ambitions, and goals. Let's discuss a
little bit more about what the actual system can do at this point. I named the
kernel Zion because it is another place I love and it is also kind of fun to
think of the operating system as everything from (A)cadia down to (Z)ion.
This section will frequently reference the source code which is available on my
self-hosted [gitea](https://gitea.tiramisu.one) or mirrored to
[GitHub](https://github.com/dgalbraith33/acadia).
### Low-level x86-64 stuff
Because I found setting up paging, the higher half kernel, and getting to long
mode to be a pain the first time around, I decided to use the [limine
bootloader](https://github.com/limine-bootloader/limine) to start the kernel
this time around instead of GRUB so I could focus on slightly higher level
things. I have ambitions to make the kernel more bootloader-agnostic in the
future but for now it is tightly coupled to the limine protocol.
On top of the things mentioned above, we use the limine protocol to:
* Get a map of physical memory.
* Set up a higher-half direct map of memory.
* Find the RSDP.
* Get a VGA framebuffer from UEFI.
* Load the 3 init programs that are needed to bootstrap the VFS.
Following boot we immediately initialize the global descriptor table (GDT) and
interrupt descriptor table (IDT). The **GDT** is mostly irrelevant for x86-64,
however it was interesting trying to get it to work with the sysret instruction,
which expects two copies of the user-space segment descriptors to allow returning
to 32-bit code from a 64-bit OS. Right now the system doesn't support 32-bit code
(and likely never will) so we just duplicate the 64-bit code segment.
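To make the sysret constraint a bit more concrete, here is the selector arithmetic as I understand it from the vendor manuals: the user-mode selectors are all derived from one base value stored in bits 63:48 of the STAR MSR. This is a worked example in JavaScript rather than kernel code, and the GDT layout (index 3 as the base) is an assumption for illustration, not necessarily Zion's actual layout.

```js
// Selector arithmetic performed by SYSRET, given the base selector stored in
// bits 63:48 of the STAR MSR. The requested privilege level ends up as 3.
function sysretSelectors(starUserBase) {
  const RPL_USER = 3;
  return {
    cs32: starUserBase | RPL_USER,        // code segment for a 32-bit return
    ss: (starUserBase + 8) | RPL_USER,    // stack segment in both cases
    cs64: (starUserBase + 16) | RPL_USER, // code segment for a 64-bit return
  };
}

// Hypothetical GDT layout: index 3 = user code (32-bit slot), index 4 = user
// data, index 5 = user code (64-bit). Descriptors are 8 bytes each, so
// index 3 corresponds to selector 0x18.
const selectors = sysretSelectors(3 * 8);
console.log(selectors);
```

Because the 32-bit and 64-bit code selectors sit 16 bytes apart with the data segment in between, the GDT needs a code descriptor in both slots, which is why duplicating the 64-bit one is enough when 32-bit code is never run.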
The **IDT** is fairly straightforward and barebones for now. I slowly add more
debugging information to faults as I run into them and it is useful. One of the
biggest improvements was setting up a separate kernel stack for Page Faults and
General Protection Faults. That way if I broke memory related to the current
stack frame I get useful debugging information rather than an immediate triple
fault. I also recently added some very sloppy stack unwind code so I can more
easily find the context that the fault occurred in.
Finally we also initialize the **APIC** in a rudimentary fashion. The timer is
used to trigger scheduling events and we map PCI and PS/2 Keyboard interrupts to
appropriate vectors in the IDT.
### Memory management
Memory management seems to be one of those areas where every time I make
progress on something I discover about 4 more things I'll have to do down the
line. I'm somewhat happy with the progress I've made so far but I still have a
lot to read up on and learn - especially relating to caching policies for mapped
pages.
For **physical memory management** I maintain the available memory regions in
two separate linked lists. One list contains single pages for when those are
requested, the other contains the large memory regions which are populated
during initialization. This design allows us to easily reuse freed pages (using
the list of small pages) while still efficiently finding large blocks for things
like memory mapped IO (using the list of large pages).
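The two-list idea can be sketched as a toy model. This is JavaScript with arrays standing in for linked lists and plain numbers standing in for physical addresses, so it only shows the allocation policy, not how the kernel actually stores anything. The class and method names are invented for the example.

```js
// Toy model of a two-list physical memory manager. Addresses are plain
// numbers and the "lists" are arrays; this only illustrates the policy.
const PAGE_SIZE = 0x1000;

class PhysicalMemoryManager {
  constructor() {
    this.freePages = []; // single pages that were freed and can be reused
    this.regions = [];   // large regions: { base, numPages }
  }

  addRegion(base, numPages) {
    this.regions.push({ base, numPages });
  }

  // Prefer a recycled page; otherwise carve one off the front of a region.
  allocatePage() {
    if (this.freePages.length > 0) {
      return this.freePages.pop();
    }
    return this.allocateContinuous(1);
  }

  // Take `numPages` contiguous pages from the first region large enough.
  allocateContinuous(numPages) {
    for (let i = 0; i < this.regions.length; i++) {
      const region = this.regions[i];
      if (region.numPages >= numPages) {
        const base = region.base;
        region.base += numPages * PAGE_SIZE;
        region.numPages -= numPages;
        if (region.numPages === 0) {
          this.regions.splice(i, 1);
        }
        return base;
      }
    }
    throw new Error("out of physical memory");
  }

  freePage(address) {
    this.freePages.push(address);
  }
}
```

Freed pages never merge back into the large regions in this sketch, which keeps both paths simple at the cost of fragmentation over time.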
The one catch is that to build these linked lists we need an available heap. And
to have an available heap we need to be able to allocate a physical memory
region for it (and its necessary paging structures). To accommodate this, we
initialize a temporary physical memory manager that just takes a hardcoded
number of pages from the first memory region and doles them out in sequence.
Right now I hardcode the number of necessary pages to exactly the number it
needs. This means if I change something that causes more pages to be allocated
earlier than they need to be it is obvious because things break.
For **virtual memory management** I keep the higher half (kernel) mappings
identical in each address space. Most of the kernel mappings are already
available from the bootloader but some are added for heaps and additional stacks.
For user memory we maintain a tree of the mapped in objects to ensure that none
intersect. Right now the tree is inefficient because it doesn't self-balance
and most objects are inserted in ascending order (i.e. it is essentially a
linked list).
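As a rough illustration of both the overlap check and the degenerate shape, here is a toy unbalanced tree keyed on address ranges. It is a JavaScript model with invented names (`MappingTree`, `insert`, `depth`), not the kernel's data structure.

```js
// Toy unbalanced binary tree of non-overlapping [base, end) ranges.
class MappingTree {
  constructor() {
    this.root = null;
  }

  // Insert [base, base + size). Returns false if it would overlap an
  // existing mapping, true once it has been added.
  insert(base, size, object) {
    const end = base + size;
    let parent = null;
    let side = null;
    let node = this.root;
    while (node !== null) {
      if (end <= node.base) {
        parent = node; side = "left"; node = node.left;
      } else if (base >= node.end) {
        parent = node; side = "right"; node = node.right;
      } else {
        return false; // ranges intersect
      }
    }
    const fresh = { base, end, object, left: null, right: null };
    if (parent === null) {
      this.root = fresh;
    } else {
      parent[side] = fresh;
    }
    return true;
  }

  // Height of the tree; equals the node count when it has degenerated.
  depth(node = this.root) {
    if (node === null) return 0;
    return 1 + Math.max(this.depth(node.left), this.depth(node.right));
  }
}

const tree = new MappingTree();
tree.insert(0x1000, 0x1000, "a");
tree.insert(0x2000, 0x1000, "b");
tree.insert(0x3000, 0x2000, "c");
console.log(tree.depth()); // ascending inserts: every node hangs off the right
```

Because the stored ranges never overlap, a new range that lies entirely to one side of a node can only collide with something in that subtree, so a single root-to-leaf walk is enough to check it.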
For user space memory structures we wait until the memory is accessed and
generates a page fault to actually map it in. In order to map it in we check
each paging structure in the higher-half direct map (rather than using a
recursive page structure) to ensure it exists, allocating a page table if
necessary. All physical pages used for paging structures are freed when the
process exits.
For **kernel heap management** I wrote a
[slab-allocator](https://en.wikipedia.org/wiki/Slab_allocation) for relatively
small allocations (up to 128 bytes currently). I plan on raising the limit for
that as well as adding a buddy allocator for larger allocations in the future
but for now there is no need - all of the allocations are 128 bytes or less!
Larger allocations for now are done using a linear allocator.
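Here is a small model of that split between slabs and a linear fallback. It is JavaScript with numbers standing in for pointers, and the size classes (8 through 128 bytes), the 4 KiB slab size, and all names are assumptions chosen for the example.

```js
const SLAB_BYTES = 4096;

// One slab hands out fixed-size slots and keeps a stack of the free ones.
class Slab {
  constructor(base, slotSize) {
    this.base = base;
    this.freeSlots = [];
    for (let off = SLAB_BYTES - slotSize; off >= 0; off -= slotSize) {
      this.freeSlots.push(base + off); // lowest address ends up on top
    }
  }
  contains(address) {
    return address >= this.base && address < this.base + SLAB_BYTES;
  }
  allocate() {
    if (this.freeSlots.length === 0) throw new Error("slab full");
    return this.freeSlots.pop();
  }
  free(address) {
    this.freeSlots.push(address);
  }
}

class KernelHeap {
  constructor(base) {
    this.slabs = new Map(); // slot size -> slab, in ascending size order
    let next = base;
    for (const size of [8, 16, 32, 64, 128]) {
      this.slabs.set(size, new Slab(next, size));
      next += SLAB_BYTES;
    }
    this.linearNext = next; // bump pointer for everything larger
  }
  allocate(bytes) {
    for (const [size, slab] of this.slabs) {
      if (bytes <= size) return slab.allocate();
    }
    const address = this.linearNext;
    this.linearNext += bytes;
    return address;
  }
  free(address) {
    for (const slab of this.slabs.values()) {
      if (slab.contains(address)) return slab.free(address);
    }
    // Linear allocations are never reclaimed in this sketch.
  }
}
```

A real slab allocator would grow by adding more slabs per size class; this one just throws when a slab fills up.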
### Scheduling
Right now the scheduling process is very straightforward. Each runnable thread
is kept in an intrusive linked list and scheduled for a single time slice in a
round robin fashion.
Threads can block on other threads, semaphores, or mutexes. When this happens
they are flagged as blocked and moved to an intrusive linked list on that object
which is responsible for scheduling those threads once the relevant state
changes.
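The round-robin queue and the per-object wait lists can be modeled in a few lines. This is a JavaScript toy with arrays in place of intrusive lists and plain objects in place of threads; `Scheduler`, `Semaphore`, and `preempt` are names made up for the sketch.

```js
class Scheduler {
  constructor() {
    this.runnable = []; // FIFO queue standing in for the intrusive list
    this.current = null;
  }
  enqueue(thread) {
    thread.state = "runnable";
    this.runnable.push(thread);
  }
  // Called on each timer tick: rotate the current thread to the back of the
  // queue (unless it blocked) and run whichever thread is at the front.
  preempt() {
    if (this.current !== null && this.current.state === "running") {
      this.enqueue(this.current);
    }
    this.current = this.runnable.length > 0 ? this.runnable.shift() : null;
    if (this.current !== null) {
      this.current.state = "running";
    }
    return this.current;
  }
}

class Semaphore {
  constructor(scheduler, count = 0) {
    this.scheduler = scheduler;
    this.count = count;
    this.waiters = []; // blocked threads live on the object they wait for
  }
  // Returns true if the thread may continue, false if it is now blocked.
  wait(thread) {
    if (this.count > 0) {
      this.count--;
      return true;
    }
    thread.state = "blocked";
    this.waiters.push(thread);
    return false;
  }
  // Wake one waiter if there is one, otherwise bank the signal.
  signal() {
    if (this.waiters.length > 0) {
      this.scheduler.enqueue(this.waiters.shift());
    } else {
      this.count++;
    }
  }
}
```

The scheduler never looks at blocked threads at all; the semaphore is the only thing that knows about them until it hands them back with `enqueue`.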
The context switching code simply dumps all of the registers onto the stack and
then writes the stack pointer into the thread structure. It also writes the SSE
registers to an allocated space on the thread structure. I believe this code
could be made more efficient by only pushing callee-saved registers and using
the x86 feature that allows you to lazily save the SSE registers only once they
are used. However for now I prefer this code be more reliable than efficient
(because it scares me and is a PITA to debug).
Finally, there are definitely critical sections in the kernel code that are not
mutex protected currently. It is on the TODO list to do a good audit of this in
preparation for SMP (AcadiaOS 0.2 anyone?).
### Interface
Most system calls the kernel provides either (a) create and return a capability
or (b) operate on an existing capability. Capabilities can be duplicated and/or
transmitted to other processes using IPC.
For syscalls that operate on an existing capability, the kernel checks that the
capability exists, that it is of the correct type, and that the caller has the
correct permissions on it. Only then does it act on the request.
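Those three checks can be sketched as a lookup on a per-process table. The following is a JavaScript model, and the permission bits, error names, and `CapabilityTable` class are all invented for illustration rather than taken from the kernel's interface.

```js
// Hypothetical permission bits.
const Permission = { READ: 1, WRITE: 2, DUPLICATE: 4 };

class CapabilityTable {
  constructor() {
    this.caps = new Map(); // capability id -> { object, type, permissions }
    this.nextId = 1;
  }

  add(object, type, permissions) {
    const id = this.nextId++;
    this.caps.set(id, { object, type, permissions });
    return id;
  }

  // Run the three checks in order and only hand back the object if all pass.
  lookup(id, expectedType, requiredPermissions) {
    const cap = this.caps.get(id);
    if (cap === undefined) {
      return { error: "CAP_NOT_FOUND" };
    }
    if (cap.type !== expectedType) {
      return { error: "CAP_WRONG_TYPE" };
    }
    if ((cap.permissions & requiredPermissions) !== requiredPermissions) {
      return { error: "CAP_PERMISSION_DENIED" };
    }
    return { object: cap.object };
  }
}

const table = new CapabilityTable();
const memCap = table.add({ size: 4096 }, "memory", Permission.READ);
console.log(table.lookup(memCap, "memory", Permission.READ));
console.log(table.lookup(memCap, "memory", Permission.WRITE));
```

Every syscall handler that takes a capability would start with a lookup like this, so the type and permission logic lives in one place.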
The kernel provides APIs to:
* Manage processes and threads.
* Synchronize threads using mutexes and semaphores.
* Allocate memory and map it into an address space.
* Communicate with other processes using Endpoints, Ports, and Channels.
* Register IRQ handlers.
* Manage Capabilities.
* Print debug information to the VM output.
### IPC
Interprocess communication can be done using Endpoints, Ports, or Channels.
**Endpoints** are like servers that can be called and provide a response. For
each call a "ReplyPort" capability is generated that the caller can wait for a
response on and the server can send its response to. **Ports** are simply
one-way streams of messages that don't expect a response. Example uses are for
process initialization information or for IRQ handlers. **Channels** are
for bidirectional message passing that I haven't found a use for and will
probably replace in the future with a byte-stream interface.
Messages that are passed on these interfaces consist of two parts: a byte array,
and an array of capabilities. Each capability passed is removed from the
existing process and passed along to whichever process receives the request.
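The move semantics for capabilities are the interesting part, so here is a toy model of a one-way port. It is JavaScript, with a `Map` per process standing in for its capability space; `Process`, `Port`, and their methods are hypothetical names, not the kernel's API.

```js
class Process {
  constructor(name) {
    this.name = name;
    this.caps = new Map(); // capability id -> kernel object
    this.nextCapId = 1;
  }
  addCap(object) {
    const id = this.nextCapId++;
    this.caps.set(id, object);
    return id;
  }
}

class Port {
  constructor() {
    this.queue = [];
  }

  // Queue a message and move the listed capabilities out of the sender.
  send(sender, bytes, capIds) {
    // Validate everything first so a bad id can't leave a half-sent message.
    if (new Set(capIds).size !== capIds.length) {
      throw new Error("duplicate capability id");
    }
    for (const id of capIds) {
      if (!sender.caps.has(id)) throw new Error("unknown capability " + id);
    }
    const objects = capIds.map((id) => {
      const object = sender.caps.get(id);
      sender.caps.delete(id);
      return object;
    });
    this.queue.push({ bytes: Uint8Array.from(bytes), objects });
  }

  // Deliver the oldest message, installing its capabilities in the receiver
  // under fresh ids. Returns null when the port is empty.
  receive(receiver) {
    if (this.queue.length === 0) return null;
    const message = this.queue.shift();
    const capIds = message.objects.map((object) => receiver.addCap(object));
    return { bytes: message.bytes, capIds };
  }
}
```

While a message sits in the queue its capabilities belong to neither process, which is one reason it matters whether anyone is still listening on the other end.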
I'm fairly happy with these interfaces so far and was able to build a user-space
IDL (Yunq) on top of them to facilitate message and capability passing. However,
I'm concerned about their ability to handle certain concerns. For instance,
since endpoints aren't "owned" by a specific process, it is impossible to tell
if you are "shouting into the void" at a process that has crashed or isn't
listening to the specific endpoint anymore.
## User Space Programs
There are a few user-space programs that are run on the system:
* **Yellowstone**: The init process that starts all others and maintains a
registry of endpoints. (Because Yellowstone was first).
* **Denali**: A basic AHCI driver to read from disk. (D for disk).
* **VictoriaFallS**: A VFS server with a super simple read-only ext2
implementation. (I couldn't resist because it has VFS in it).
* **Teton**: A terminal application with a lightweight shell in it (should
eventually be split). (T for terminal).
* **Voyageurs**: PS/2 Keyboard driver with the intent of becoming the USB
driver. (Idk bytes traveling over USB are making a voyage I guess).
These programs are all bare-bones versions of what they could be in the future.
I hope to describe them in further detail in the future, but for now the
initialization process works like this.
1. Yellowstone, Denali, and VictoriaFallS binaries are loaded into memory as
modules by the bootloader.
2. The kernel loads and starts the Yellowstone process, passing it memory
capabilities to the Denali and VictoriaFallS binaries.
3. Yellowstone starts Denali and waits for it to register itself.
4. Yellowstone reads the GPT and then starts VictoriaFallS on the correct
partition and waits for it to register itself.
5. Yellowstone then reads the /init.txt file from the disk and starts each
process specified (one per line) in succession.
## Yunq IDL
As I began writing system services, I found a huge speed bump was creating
client and server classes for the service. I started by just passing structs as
a byte array and hardcoding whether or not the process expected to receive a
capability with the call. This approach worked but was painful and led to me
dreading each new service I added to the system (not how it should be for a
microkernel architecture!). Additionally, I avoided things like repeated
fields or string fields that weren't possible to pass in a single struct.
It was clear I needed some sort of IDL to handle this, but for months I waffled
on it as I tried to figure out how to incorporate an existing one into the
system. That didn't work for two reasons. First, we need a way to pass
capabilities with the messages. These kind of need to be sidechanneled because
the kernel can't just treat them as another string of bytes (they have to be
moved into the other process's capability space). Second, existing serialization
libraries tend to have dependencies, so porting them would require porting those
dependencies first. Granted, some of them just require super basic things like
say a libc implementation - but we don't even have that yet. All that to say I
ended up writing my own.
I was pleasantly surprised with how straightforward it ended up being. I think
it took me about 3 coding sessions to get the basic parsing and codegen going
for the language. It still doesn't have all of the features I planned for it
(like nested messages), but it works super well for setting up new services
quickly and easily. Currently the implementation is in Python because I wanted
to get something working quickly, but I'll probably reimplement it in a compiled
language in the future with a focus on better error information.
## Closing thoughts
Overall, I'm very pleased with how this project has turned out. I feel like I've
definitely accomplished my goal to learn more about how operating systems are
actually implemented. It has been cool to be able to pull back the curtain and
see some of the simple primitives that underlie the complex features of an
operating system.
I aim to continue forward with this project - without throwing out the code
again as I did earlier this year. I'm happy with the base and look to iterate on
it, hopefully building something more useful in the future but definitely
learning more along the way.