ggx: Hello World!
[Patch 19 in a series]
This is it. Today we're building and running a "Hello World" C app on the ggx simulator.
Yesterday we identified nine missing instructions required to link hello.c to newlib. They were all arithmetic and logical operators and were simple to implement in all of the tools.
Today's patch also includes a software interrupt instruction (swi). This is the instruction that ggx code will use to talk to the simulator in order to interface with the outside world. Consider, for instance, the primitive function "write":
int write (int fd, const void *buf, size_t len);
We implement a system call in libgloss like so:
/*
* Input:
* $r0 -- File descriptor.
* $r1 -- String to be printed.
* -8($fp) -- Length of the string.
*
* Output:
* $r0 -- Length written or -1.
* errno -- Set if an error
*/
write:
swi SYS_write /* SYS_write is a constant 5 */
ret
Then the simulator's handling of the swi instruction includes a switch on the interrupt number:
case 0x5: /* SYS_write */
{
char *str = &memory[cpu.asregs.regs[3]];
/* String length is at 0x8($fp) */
unsigned count, len = EXTRACT_WORD(&memory[cpu.asregs.regs[0] + 8]);
count = len;
while (count-- > 0)
putchar (*(str++));
cpu.asregs.regs[2] = len;
}
It's not a perfect implementation (all output goes to stdout, and there's no error checking), but it works for now.
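For comparison, here is a minimal sketch of what a more defensive handler could look like. It is not the code in today's patch: sim_memory, SIM_MEM_SIZE, host_stream_for_fd and sim_sys_write are hypothetical names, and the fd mapping (1 to stdout, 2 to stderr) is an assumption. It only illustrates honouring the file descriptor and bounds-checking the buffer before touching simulated memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SIM_MEM_SIZE 4096u          /* hypothetical simulated memory size */

/* Hypothetical stand-in for the simulator's flat memory array. */
static unsigned char sim_memory[SIM_MEM_SIZE];

/* Map a simulated file descriptor to a host stream (assumed mapping). */
static FILE *host_stream_for_fd(int32_t fd)
{
    switch (fd) {
    case 1:  return stdout;
    case 2:  return stderr;
    default: return NULL;           /* anything else is unsupported here */
    }
}

/* Write len bytes starting at simulated address addr to descriptor fd.
   Returns the number of bytes written, or -1 on a bad descriptor, an
   out-of-range buffer, or a host write error. */
static int32_t sim_sys_write(int32_t fd, uint32_t addr, uint32_t len)
{
    FILE *out = host_stream_for_fd(fd);
    if (out == NULL)
        return -1;

    /* Reject buffers that fall outside simulated memory. Written this
       way, the check cannot overflow. */
    if (addr > SIM_MEM_SIZE || len > SIM_MEM_SIZE - addr)
        return -1;

    size_t written = fwrite(&sim_memory[addr], 1, len, out);
    assert(written <= len);
    if (written < len && ferror(out))
        return -1;
    return (int32_t) written;
}
```

In a handler like the one above, the result of sim_sys_write would be stored in the return-value register in place of the unconditional len.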
Today's patch and tests really put the toolchain through its paces, as we'll be simulating thousands of instructions just for hello.c! Our simulator runs to date have been limited to a dozen or so instructions, so this is a big jump...
$ cat hello.c
#include <stdio.h>
int main()
{
puts ("Hello World!");
return 0;
}
$ ggx-elf-gcc -o hello hello.c
$ ggx-elf-run hello
Hello World!
$ ggx-elf-run -v hello
ggx-elf-run hello
Hello World!
# instructions executed 2704
2704 instructions is a lot of code for just Hello World. Consider, however, that this includes all of the system initialization code, such as initializing the heap, allocating IO buffers with malloc(), etc. There's a lot that has to happen before we get to printing our greeting.
To be honest, I skipped a step in there. It's the one where running "hello" fails in many interesting ways and you have to debug simulator traces. There's no avoiding this step. Just think of it as a rite of passage.
In my case, I tracked it down to bad relocation generation in the assembler and was able to fix it thanks again to #gcc IRC folks. Today's patch includes this fix.
When I started this series I wrote that I would show how to go from nothing to running real programs on a new architecture by posting daily patches. I think we're there, so I'm going to slow down the blogging effort. But that's not to say that I'm done with ggx. There's still plenty to do. Here are some ideas for work items:
gcc testsuite
GCC comes with a huge suite of tests to exercise code generation. The next obvious step in the development of this toolchain is to start whacking away at bugs identified by the testsuite. I've had a quick crack at it, and a couple of obvious things need fixing. For instance, there's an off-by-one error in ABI handling for vararg functions. There's also no support for trampolines, which are needed for GCC's nested functions extension. Apart from that, the results look pretty good so far.
broader language support
Exception handling is a big issue here. The compiler needs to be taught how to deal with C++ and Java exceptions. libffi is also needed for Java.
gdb support
This is a tricky one for ggx. If we were working from a pre-defined ISA, then I would go straight to gdb next. Stepping through code with gdb is much more productive than reading instruction traces from the simulator. But we're not working from a pre-defined ISA with fixed instruction encodings. We're just at a first draft of the ISA and plan to make many changes.
Unfortunately, it looks like a lot of what goes into making gdb work is hard-coding recognition of instruction sequences for things like function prologues. I don't want to have to hack gdb every time I tweak the ISA, so I'll leave this 'til much later in the game.
ISA tweaking
This is where the game really begins. Check out this chart, which shows the static frequency of instruction types used in our hello application:
[Chart: static frequency of instruction types in the hello application]
The most frequently used instruction type is GGX_F1_A4, which is a 16-bit instruction with one operand followed by a 32-bit value. These are all of the "load immediate" and "load absolute" instructions. What we'll want to do is understand everything about these instructions: how and why they're used, and whether there's anything we can do to eliminate them or encode them more efficiently. We already know there are 6 bits of waste in the first 16 bits of GGX_F1_A4 because we're not using operands B and C. Perhaps we can use those 6 bits to hold a small constant value. That would turn some number of those 48-bit load-immediate instructions into 16-bit instructions. There will be many opportunities for improvements like this, and it should be interesting to see how dense we can make our code.
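To get a feel for the potential payoff, the arithmetic can be sketched in a few lines of C. Everything here is an assumption for illustration: no such short form exists in ggx yet, the 6 spare bits are treated as an unsigned constant in the range 0 to 63, and the names fits_short_form, encoded_bits and bits_saved are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SPARE_BITS      = 6,    /* unused operand B and C bits in GGX_F1_A4 */
    SHORT_FORM_BITS = 16,   /* hypothetical: constant packed into the opcode word */
    LONG_FORM_BITS  = 48    /* 16-bit opcode word plus 32-bit value */
};

/* Assumes the spare bits would hold an unsigned constant (0..63). */
static int fits_short_form(uint32_t imm)
{
    return imm < (1u << SPARE_BITS);
}

/* Size in bits of a load-immediate under the hypothetical encoding. */
static unsigned encoded_bits(uint32_t imm)
{
    return fits_short_form(imm) ? SHORT_FORM_BITS : LONG_FORM_BITS;
}

/* Total bits saved over a list of immediates, compared with always
   emitting the 48-bit form. */
static unsigned long bits_saved(const uint32_t *imms, size_t n)
{
    unsigned long saved = 0;
    for (size_t i = 0; i < n; i++)
        saved += LONG_FORM_BITS - encoded_bits(imms[i]);
    return saved;
}
```

Feeding bits_saved the immediates pulled from a disassembly would give a first estimate of how much a change like this shrinks a program. A signed interpretation of the 6 bits would shift the range and could be compared the same way.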
Run Linux!
Only half joking here. Truth be told, it builds, but we're a long way from booting...
[Chart: static frequency of instruction types in the Linux kernel build]
It's interesting to see that the mix of instructions is completely different in the kernel. The 4th and 5th most frequently used instruction types are swapped. This demonstrates how it will be important to have a good mix of programs at hand while we're tweaking the ISA.
I hope some people have found this series interesting. I'm interested in hearing feedback if you are so inspired.
As usual, today's patch is available in the ggx patch archive.

