Hello "Hello world!"
Aug 15, 2020Languages are often judged initially on their "Hello, world!" program. How easy is it to write? To run? How easy is it to understand? It's a very simple program, of course, one of the simplest, even... just produce a little text, and display it, what could be simpler?
It's really not fair to judge a language by such a cursory impression, but it can give you an idea of what a language values and how it works. What does the syntax look like? Is it typed? Is it interpreted? You can usually tell a lot at a glance.
For example, One of Ruby's (many) hello worlds is so simple, it's also Python!
print('Hello world!')
Often, people coming from interpreted languages experience compiled, systems languages to be more complicated right off the bat. There is the obvious added complexity of compiling and running being separate steps, as opposed to simply pointing an executable at some source code and seeing a result right away, but there are often syntactical constructs to go along with that...
At first glance, Rust's hello world looks fairly inert, as well:
fn main() {
println!("Hello World!");
}
But println!
is actually a macro, what does it look like expanded?
macro_rules! println {
() => ($crate::print!("\n"));
($($arg:tt)*) => ({
$crate::io::_print($crate::format_args_nl!($($arg)*));
})
}
You'll notice that this also has a macro inside of it. This matches on the
second case (because there is an argument present) and calls into
$crate::format_args_nl!
and passes the result of that to $crate::io::_print
pub fn _print(args: fmt::Arguments<'_>) {
print_to(args, &LOCAL_STDOUT, stdout, "stdout");
}
print_to
which looks like
fn print_to<T>(
args: fmt::Arguments<'_>,
local_s: &'static LocalKey<RefCell<Option<Box<dyn Write + Send>>>>,
global_s: fn() -> T,
label: &str,
) where
T: Write,
{
let result = local_s
.try_with(|s| {
// Note that we completely remove a local sink to write to in case
// our printing recursively panics/prints, so the recursive
// panic/print goes to the global sink instead of our local sink.
let prev = s.borrow_mut().take();
if let Some(mut w) = prev {
let result = w.write_fmt(args);
*s.borrow_mut() = Some(w);
return result;
}
global_s().write_fmt(args)
})
.unwrap_or_else(|_| global_s().write_fmt(args));
if let Err(e) = result {
panic!("failed printing to {}: {}", label, e);
}
}
Which is, uh, well let's just say it's not exactly simple looking now? There is a lot going on here!
To be clear, I'm not faulting Rust here at all, my point is exactly the
opposite actually, in that there is always necessarily more going on in a
"Hello world!" than puts "la de da"
or similar would have you believe on its
face. Speaking of Ruby's puts
, what is the code that runs puts
in the
Ruby interpreter itself, which is written in C?
Well it looks like this
VALUE
rb_io_puts(int argc, const VALUE *argv, VALUE out)
{
int i, n;
VALUE line, args[2];
/* if no argument given, print newline. */
if (argc == 0) {
rb_io_write(out, rb_default_rs);
return Qnil;
}
for (i=0; i<argc; i++) {
if (RB_TYPE_P(argv[i], T_STRING)) {
line = argv[i];
goto string;
}
if (rb_exec_recursive(io_puts_ary, argv[i], out)) {
continue;
}
line = rb_obj_as_string(argv[i]);
string:
n = 0;
args[n++] = line;
if (RSTRING_LEN(line) == 0 ||
!rb_str_end_with_asciichar(line, '\n')) {
args[n++] = rb_default_rs;
}
rb_io_writev(out, n, args);
}
return Qnil;
}
Hello world!
We all know that a languages like Ruby or Python are designed explicitly to hide this sort of complexity from us and let us get on with the dirty business of munging data blobs or serving web requests or solving sudokus or whatever, and thank goodness for that, but wow that is quite a lot, isn't it?
When people come from languages that were designed to be ergonomic to more systems oriented languages, they're often jarred by what they perceive to be code thatis inelegant, ugly, and verbose. To be sure, it sometimes is exactly that... (although anyone who has worked with a "pretty" language in a production codebase knows that those are not immune to these descriptors either).
Usually, the tradeoff is explicit: elegance and simplicity for control... specific and granular control, over the program that will eventually be run. It isn't always necessary, in fact it is almost always _un_necessary, to have that much control over your program. Obviously, productivity matters, and if your business is insert viable business, well it's likely that your goals are not going to be optimally met by futzing with manual memory management all day (at least from the macro level, in the general sense).
But what if you do need that control? Well then, you need it. When every ounce of performance actually is necessary, or on embedded systems with hard memory constraints, or when writing code for some bespoke or otherwise uncommon processor.
I'm going to choose one language, Zig, and dive deep into its hello world, but it is important to note here that my point is not primarily about Zig, it's about how all languages have to contend with an enormous amount of complexity in order to do anything, even the simplest of tasks like a hello world program. Complexity that is, for the most part, hidden from us in our day to day. So what in the hello world is actually going on then?
I'll be using the most current minor release version of Zig: 0.6.0.
Let's take a walk
Zig's hello world looks like this, from the docs.:
const std = @import("std");
pub fn main() !void {
const stdout = std.io.getStdOut().outStream();
try stdout.print("Hello, {}!\n", .{"world"});
}
If you are new to zig, a quick word on this syntax before I get into the gritty details.
const std = @import("std");
@import
is a compiler
builtin function
that assigns the namespace of the file it is referencing to the const
variable on the left hand side.
pub fn main() !void {
//...
}
Just like in C, main
is a special function that marks the entry point to a
program after it has been compiled as an executable. Unlike in C, it accepts no
arguments (C's main function has a variety of vagaries that make it a bit
unique) and command line input is
available through utility functions to allow easier cross platform use.
It is marked pub
so
that it is accessible from outside of the immediate module ('module' here
referring to nothing more than the top level scope of the current namespace...
i.e., the file), this is a necessary step since, as the program's entry point,
main
would have to be accessible from outside the immediate scope.
fn
is the function keyword.
main()
is the name of the function (and where the argument list would be)
and !void
is the return type. Looking a little closer at that return type:
In C, the return type of a function is declared before anything else. This makes a certain amount of sense: it's congruent with how variables are declared, after all, and scanning the file you can see clearly "calling this will get you that."
In Zig, the return type comes after the function declaration but before the function body. This also makes sense! It's the same in Rust and Go, and seems to be generally a more modern approach. The reason is actually pretty simple: doing it this way makes it possible to have a context-free grammar! C and C++ put the parser in a position where it has to understand semantics to even just parse the source code.
In Zig, main
returns void
(well, actually, it can return a variety of
things, and if it returns void (which is just a way of saying it doesn't return
anything at all)), it's actually returning
0
as a success code, but) there is a wrinkle! void
is preceded by an
exclamation mark. This means: "This function is supposed to return void
, but
it could fail and return an error." This is an inferred error
set, and
whenever a function that could fail is called, the compiler will enforce that
you handle that error at the call site. More on Zig's error handling some other
time, for now it is enough to understand what the !
in front of the return
type declaration means. I want to move on to the body of the function, let's go
line by line.
const stdout = std.io.getStdOut().outStream();
So, we can see that this is a call into a standard library function (std
)
that returns something that we assign to const stdout
. Standard out (stdout) and
standard error (stderr) may be familiar concepts from the shell, but what does it
mean to be referring to stdout
here in this program? What exactly is
stdout
? Whatever it is, it's being returned by the call to outStream()
,
which is a method called on the return value of std.io.getStdOut()
, so we
first need to know what that is.
To the source! In the Zig source tree, std
lives in lib/std/std.zig
, which
is a file that makes a wide variety of functionality available. It includes the line:
pub const io = @import("io.zig");
Which is referred to on the std
variable as std.io
(again, notice the pub
keyword, without which this declared constant would be inaccesible outside of
this immediate scope). Going deeper, into lib/std/io.zig
...
pub fn getStdOut() File {
return File{
.handle = getStdOutHandle(),
.capable_io_mode = .blocking,
.intended_io_mode = default_mode,
};
}
So, stdout
is a File struct. Let's look at that. It is imported at the top
of io.zig
as
const File = std.fs.File;
and lives in the source, perhaps unsurprisingly, at lib/std/fs/File.zig
. This
struct definition is quite long, so I'll focus on what we want to look at, the
outStream()
method.
An aside: methods vs functions
Zig doesn't really have methods, but it's useful to talk about a special
class of functions as methods, since the calling convention supports implicit
passing of self
when called on a struct "instance" using dot syntax. Let me
show you what I mean.
const std = @import("std");
const Thing = struct {
instanceVariable: u8,
const classVariable = 41;
fn staticMethod(y: u8) u8 {
return classVariable + y;
}
fn instanceMethod(self: Thing) u8 {
return self.instanceVariable;
}
};
pub fn main() !void {
std.debug.warn("{}\n", .{ Thing.staticMethod(1) }); // 42
const thing = Thing{ .instanceVariable = 1 };
std.debug.warn("{}\n", .{ thing.instanceMethod() }); // 1
}
So, despite the lack of explicit classes, these patterns are available
because of support for this calling convention. Treating a struct
like a
class definition, you can call a "static method" on the struct definition
itself. In the example above,
Thing.staticMethod(1);
Is equivalent to the
Thing::staticMethod
syntax in Ruby. In fact, the equivalent example in Ruby looks startingly similar to the Zig version:
class Thing
attr_accessor :instanceVariable
@@classVariable = 41
def initialize(instanceVariable)
@instanceVariable = instanceVariable
end
def self.staticMethod(y)
@@classVariable + y
end
def instanceMethod()
@instanceVariable
end
end
p Thing::staticMethod(1) # 42
thing = Thing.new(1)
p thing.instanceMethod # 1
There are of course notable differences here! Attempting to call a static method on an instance of a class in ruby
p thing.staticMethod 2
will not get you very far
thing.rb:21:in `<main>': undefined method `staticMethod' for #<Thing:0x0000000002284d38 @instanceVariable=1> (NoMethodError)
Likewise, the other way:
p Thing::instanceMethod(1)
thing.rb:18:in `<main>': undefined method `instanceMethod' for Thing:Class (NoMethodError)
Ruby is a full throated object oriented language, and so of course its underlying class abstraction is more robust than this facsimile of one in Zig, but the effect of that is that, well, there's really nothing special about a zig "instance" vs "static" method, as they are simply functions defined on the struct that happen to be available through multiple calling conventions.
Take this again, with the same Thing
struct definition from above:
const thing = Thing{ .instanceVariable = 1 };
std.debug.warn("{}\n", .{ thing.staticMethod() });
You will get a compiler error:
./thing.zig:19:31: error: expected type 'u8', found 'Thing'
std.debug.warn("{}\n", .{ thing.staticMethod() });
But it's telling you that you passed a Thing
to the method. This is the
important point: "instance methods" have special access to "instance variables"
because they have a reference to the struct they are being called on, that's
all. That's all the magic there is here.
Note also that there is nothing special about the word 'self', it is just a conventional variable name.
For completeness, the other direction:
std.debug.warn("{}\n", .{ Thing.instanceMethod() });
You will get what you might expect, given the last example:
./thing.zig:17:51: error: expected 1 arguments, found 0
std.debug.warn("{}\n", .{ Thing.instanceMethod() });
No implicit passing of self
means an arity error on this call.
But, to underscore the fact that there is nothing magical happening here, you can indeed do this:
const thing = Thing{ .instanceVariable = 1 };
std.debug.warn("{}\n", .{ thing.instanceMethod() });
std.debug.warn("{}\n", .{ Thing.instanceMethod(thing) });
Those two calls toinstanceMethod
are the same, but with differing calling
conventions (and so the first one passes self
implicitly!)
The outStream()
"method"
Back in lib/std/fs/File.zig
, we see the definition of this "instance method"
pub fn outStream(file: File) OutStream {
return .{ .context = file };
}
This is returning an OutStream
struct that is initialized with self
of the
File
it was called on (here referred to as file
). Zig supports anonymous
struct
literals
and in this case is able to infer the type based on the return value of the
function. Note too the odd syntax of starting an anonymous struct literal with
.
, which is to syntactically distinguish it from a block.
So, further down again, what is an OutStream
? Its definition is just above:
pub const OutStream = io.OutStream(File, WriteError, write);
Hmm, this is interesting... is this function call returning a... type
definition? That is then assigned to OutStream
and used as a return value for
pub fn outStream
?
That's exactly what it's doing! In lib/std/io/outStream.zig
:
pub fn OutStream(
comptime Context: type,
comptime WriteError: type,
comptime writeFn: fn (context: Context, bytes: []const u8) WriteError!usize,
) type {
return struct {
context: Context,
//...
}
}
This is Zig's way of supporting generics! Given some compile time known values, you can create a struct definition on the fly, at compile time. Here is a more detailed post about that capability: What is Zig's Comptime?.
For now, take careful note that write
is being passed to io.OutStream
as
the writeFn
argument, which will eventually be what is called to print to
standard out.
Alright, phew, so that's
const stdout = std.io.getStdOut().outStream();
We've ended up with an OutStream
struct with its context
field initialized
to the File
struct returned by std.io.getStdOut()
.
Now for the money business.
try stdout.print("Hello, {}!\n", .{"world"});
The definition of this "instance method" lives in lib/std/io/outStream.zig
.
pub fn print(self: Self, comptime format: []const u8, args: var) Error!void {
return std.fmt.format(self, format, args);
}
This dispatces self
to std.fmt.format
along with two more arguments. Let's
look at that function
pub fn format(
out_stream: var,
comptime fmt: []const u8,
args: var,
) !void {
//...
}
Ok, getting closer: out_stream
is in this case, the File
from way back
at the beginning.
Andy said: "Almost - it's the file.outStream() return value. Which is just the "Context" with the write function as part of the type. The way streams work in zig right now is with "duck typing". It optimizes well, the API is mostly good, but it can produce bloated code, and in some cases the API is annoyingly too generic. Sometimes it would be nice to accept a non-"var" type as a stream parameter."
The other two arguments are being passed in at the top level
call site, a string constant and an anonymous list
literal
(whose behavior is unsurprisingly similar to the aforementioned anonymous struct
literal) of positional arguments meant to be interpolated into the format
string at points marked by {}
. You can pass in formatting
options much like
c's printf
.
format
is a long function, there is a lot of bookeeping going on, but the
meat of it are its calls to out_stream.writeAll
. Jumping back to that
definition:
pub fn writeAll(self: File, bytes: []const u8) WriteError!void {
var index: usize = 0;
while (index < bytes.len) {
index += try self.write(bytes[index..]);
}
}
We can see that it calls into self.write
, which looks like:
pub fn write(self: File, bytes: []const u8) WriteError!usize {
if (is_windows) {
return windows.WriteFile(self.handle, bytes, null, self.intended_io_mode);
} else if (self.capable_io_mode != self.intended_io_mode) {
return std.event.Loop.instance.?.write(self.handle, bytes);
} else {
return os.write(self.handle, bytes);
}
}
And now, finally, we're down to the system
in systems programming
This
method operates differently depending on the system it's being used on! At the
top of this file lib/std/fs/file.zig
,
const is_windows = std.Target.current.os.tag == .windows;
I am not on windows, and I will for now ignore the second branch so I don't
have to get into async
(that's a whole other
potato!), so I end up here:
return os.write(self.handle, bytes);
I am calling into an os specific library function that accepts a place to write bytes and bytes to write (by this point formatted with those interpolated values from the call site). Here's where it gets good.
os.write
calls into system.write
which is defined per architecture
pub const system = if (@hasDecl(root, "os") and root.os != @This())
root.os.system
else if (builtin.link_libc)
std.c
else switch (builtin.os.tag) {
.macosx, .ios, .watchos, .tvos => darwin,
.freebsd => freebsd,
.linux => linux,
.netbsd => netbsd,
.dragonfly => dragonfly,
.wasi => wasi,
.windows => windows,
else => struct {},
};
For me, that ends up being linux
, defined here:
pub const linux = @import("os/linux.zig");
So in my case, system.write
ends up being:
pub fn write(fd: i32, buf: [*]const u8, count: usize) usize {
return syscall3(.write, @bitCast(usize, @as(isize, fd)), @ptrToInt(buf), count);
}
where syscall3
is imported directly into the namespace according to
architecture:
pub usingnamespace switch (builtin.arch) {
.i386 => @import("linux/i386.zig"),
.x86_64 => @import("linux/x86_64.zig"),
.aarch64 => @import("linux/arm64.zig"),
.arm => @import("linux/arm-eabi.zig"),
.riscv64 => @import("linux/riscv64.zig"),
.mips, .mipsel => @import("linux/mips.zig"),
else => struct {},
};
here, the "3" in syscall3 refers to the number of "arguments" required for the syscall being invoked.
for me, that's x86_64
, and it looks like this:
pub fn syscall3(number: SYS, arg1: usize, arg2: usize, arg3: usize) usize {
return asm volatile ("syscall"
: [ret] "={rax}" (-> usize)
: [number] "{rax}" (@enumToInt(number)),
[arg1] "{rdi}" (arg1),
[arg2] "{rsi}" (arg2),
[arg3] "{rdx}" (arg3)
: "rcx", "r11", "memory"
);
}
This is close to but not exactly what the compiler will actually emit for this call. Amazingly, even all the way down here, where we're seeing register names invoked directly, there is still a layer of abstraction within which the compiler has some wiggle room.
Not so simple a program now, is it? Remember, these syscalls differ for each architecture, the compiler produces machine code based on what you're targeting, so this is just one of many possible paths. I think that it is very easy to forget how complicated this can quickly become when you poke around in the details.
Bottoms up 🍺
Let me come at this from a slightly different angle now. We know that the Zig compiler's job, just like any compiler, is to take source code and turn it into something else. Zig is highly portable; using llvm as a backend means it can target basically anything that llvm targets, with the caveat that not all library functions will have the same amount of support on all platforms (see the support table) for more detail on that).
So, there are many possible targets, and so there are many possible "something else"s for the source to be turned into. But let's look at the most obvious case: building an executable that targets my current running system.
The transformation pipeline inside the compiler goes from source
->
intermediate representation(s)
-> target
, where intermediate
representations
could be many things and include many steps. Zig has its own
IR, as a matter of fact, on which it runs its own static analysis processes
before transforming it to LLVM IR and passing it along where is could be
processed through many possible optimazation and compilation/assemblage
steps (LLVM calls these "passes").
target
, for me, is x86_64 machine code, but the last stop before that,
conceptually as well as most probably actually, is assembly itself.
Because clang is a full compiler toolchain built on llvm, and Zig can be used
as a drop in replacement for
clang,
we should be able to use zig cc
to compile assembly code directly into machine code.
This step is actually called assembling, not compiling, and is done by an
"assembler," instead of a compiler, but tbqh and imho these are distinctions
without much of a
difference.
What is the advantage of using zig cc
? Primarily that you are able to
reliably use the same toolchain and version of llvm that the version of zig you
are using relies on. No futzing around with system libraries and linkages, it's
all just ready to work.
So! I'll make an empty file:
$ touch hello.s
clang
is smart enough to detect a filetype by its extension, and so, so is
zig cc
.
$ zig cc hello.s
zig: warning: argument unused during compilation: '-nostdinc' [-Wunused-command-line-argument]
zig: warning: argument unused during compilation: '-fno-spell-checking' [-Wunused-command-line-argument]
zig: warning: argument unused during compilation: '-fno-omit-frame-pointer' [-Wunused-command-line-argument]
zig: warning: argument unused during compilation: '-D _DEBUG' [-Wunused-command-line-argument]
zig: warning: argument unused during compilation: '-fstack-protector-strong' [-Wunused-command-line-argument]
zig: warning: argument unused during compilation: '--param ssp-buffer-size=4' [-Wunused-command-line-argument]
zig: warning: argument unused during compilation: '-isystem /usr/local/include' [-Wunused-command-line-argument]
zig: warning: argument unused during compilation: '-isystem /usr/include/x86_64-linux-gnu' [-Wunused-command-line-argument]
zig: warning: argument unused during compilation: '-isystem /usr/include' [-Wunused-command-line-argument]
lld: error: undefined symbol: main
>>> referenced by start.S:104 (/home/jfo/code/zig/build/lib/zig/libc/glibc/sysdeps/x86_64/start.S:104)
>>> /home/jfo/.cache/zig/stage1/o/ujWleITFBRHwV19Tq0gsSK_F_gRDc7-jgOCip86Un1bhdmx0pIXLGQRxtjMtuntC/Scrt1.o:(_start)
Most of this is just telling us that the flags zig is passing to clang by default weren't used for anything, which isn't much of a surprise since there was nothing to compile! (Strictly speaking, this is a bug in the zig compiler, but for our purposes it has no effect) There is a real error here, too.
lld: error: undefined symbol: main
lld
is the linker bundled with llvm bundled with
clang, and so bundled with zig, and it is complaining that this program (which
is empty) that we're trying to turn into an executable doesn't have an entry
point. How would you run it? Where would you start? A reasonable complaint, this one.
We can instead build an "object file" that isn't intended to be executable by
passing the -c
flag.
These options are the same options as clang, as all of these arguments are simply being forwarded to clang along with the compiler flags set by zig as defaults.
$ zig cc -c hello.s
This throws all the same warnings as before, but it succeeds, and produces
hello.o
, an object file.
Running file
on this
output
$ file hello.o
Will tell us what we've got.
hello.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
This is essentially a bundle of machine code that would be suitable for linking into other programs during their own linking phase, if there was actually any code in there at all to be used. As it stands, there's just the ELF header to identify the file type and what I assume to be a bit of metadata and some padding.
But we wanted to actually make an executable, so we need an entry point! What does an entry point look like?
In x86 assembly, the default entry point looks like:
_start:
I add that to the file, and try again:
$ zig cc hello.s
and
lld: error: undefined symbol: main
>>> referenced by start.S:104 (/home/jfo/code/zig/build/lib/zig/libc/glibc/sysdeps/x86_64/start.S:104)
>>> /home/jfo/.cache/zig/stage1/o/ujWleITFBRHwV19Tq0gsSK_F_gRDc7-jgOCip86Un1bhdmx0pIXLGQRxtjMtuntC/Scrt1.o:(_start)
(I have left out the assembler warnings from above).
What? If the default entry point is _start
, why is it asking for main
, then?
Looking at the error, it is trying to load _start
, just not our start.
When you compile a regular zig or c program with a main
function, your
program doesn't actually start at main, it also starts at _start
, which
is responsible for doing memory setup and generally getting everything tidy for
you before your main
function runs. Remember, here we're just using zig cc
as a pass through for clang, and so the real definition of _start
resides
in libc, and looks like
this.
There are two ways I can solve this now. First, I can just make an assembly
file with main:
instead... let's try that.
main:
$ zig cc hello.s
lld: error: undefined symbol: main
>>> referenced by start.S:104 (/home/jfo/code/zig/build/lib/zig/libc/glibc/sysdeps/x86_64/start.S:104)
>>> /home/jfo/.cache/zig/stage1/o/ujWleITFBRHwV19Tq0gsSK_F_gRDc7-jgOCip86Un1bhdmx0pIXLGQRxtjMtuntC/Scrt1.o:(_start)
Ah, remember pub fn main()
? In addition to being defined, this symbol also
needs to be available to the linker. In assembly, that means putting it as a
.globl
declaration.
.globl main
main:
Without that, the assembler is free to discard or mangle the label since it assumes it's not needed for any other steps.
$ zig cc hello.s
This assembles! When I run the resulting executable, I get:
Trace/breakpoint trap (core dumped)
This isn't surprising, I've written a program that has no instructions. I'm not particularly interested in this error right now, the important thing is that I got this to assemble.
I'm also not interested in using the libc startup code and _start
call; I
want to do everything myself.
I will try to definie my own .globl _start
:
.globl _start
_start:
$ zig cc hello.s
lld: error: duplicate symbol: _start
>>> defined at start.S:63 (/home/jfo/code/zig/build/lib/zig/libc/glibc/sysdeps/x86_64/start.S:63)
>>> /home/jfo/.cache/zig/stage1/o/ujWleITFBRHwV19Tq0gsSK_F_gRDc7-jgOCip86Un1bhdmx0pIXLGQRxtjMtuntC/Scrt1.o:(_start)
>>> defined at zig-cache/o/UbeRHF13NXWYrl3f0cuxy3OffVcjvaAbWr5e2svkLcYXPpPG03d4JplnyJU7tFJ7/empty.o:(.text+0x0)
Given what we've seen so far, this makes complete sense. _start
has already
been defined in the libc code, so simply putting it here results in this error,
because of course it does.
The linker takes an option to simply not use anything in the standard library, which is exactly what I want.
$ zig cc -nostdlib hello.s
./a.out
Segmentation fault (core dumped)
Hello world!
Alright, now I've sussed out the precise incantations to go directly from an
x86 assembly file (.s
) to an executable. What is the smallest assembly
program I can write?
That would be a program that simply exits.
.intel_syntax noprefix
.globl _start
_start:
mov rax, 60
syscall
I am using intel syntax here, the first line tells the assembler that. An interesting note here, is that llvm doesn't need you to explicitly say
noprefix
butgcc
does, so it makes sense to always use it.
What's happening here? I'm just putting a static value: 60
, into the rax
register, and then making a syscall
. The syscall looks at the rax
register
and does what the value inside of it corresponds to, which is sys_exit
, so
the program exits. That's it.
When I compile and run this program, nothing happens, but
strace
tells me that it's
doing exactly what I expected:
zig cc -nostdlib empty.s && strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffd47b269c0 /* 104 vars */) = 0
exit(0) = ?
+++ exited with 0 +++
Futhermore, the sys_exit
syscall looks in the rdi
register to get the
value it returns to the calling process. I can put whatever I want in there
before executing the syscall.
.intel_syntax noprefix
.globl _start
_start:
mov rdi, 0xface
mov rax, 60
syscall
execve("./a.out", ["./a.out"], 0x7fffa91a9be0 /* 104 vars */) = 0
exit(64206) = ?
+++ exited with 206 +++
You'll notice, that even though I loaded a 16 bit value into the register (
0xface
, which is equivalent to64206
), andr
prefixed registers are 64 bits wide, it only looked at the bottom 8 bits (0xce
or206
). This leads me to believe that error codes must be between 1 and 255. It seems as though there are only 131 actual specified standard errors, so I am sure this is some ancient magick that limits the enumerated error types to 8 bits from the kernel's perspective.
Running this without strace, nothing happens, which surprised me, actually. I
thought that 0
was a success code and anything else was an error code. This
is conventionally the
case!
But it's up to the caller to interpret that code and respond to it. For my
little program, there is no error handling that reports back to the user what's
going on, it just dutifully exits with the value I gave it and that's that.
Though the origins of the hello world are well known (it was the very first example in the hugely influential K&R C book), the definition of a hello world has grown over the years. Though the canonical example is still "output the literal text 'Hello World!'", lots of paradigms and systems have their own version of it- my favorite is the arduino blink; it's the simplest thing you can do with it that proves it's working as intended. I've written about this quality previously.
By that measure, the above code is already an assembly "Hello World!" in that we've compiled it and proved that it works. But it's a very small step from here to get to a literal "Hello World!"
Here it is:
.intel_syntax noprefix
.globl _start
_start:
mov rax, 0x1
mov rdi, 0x1
lea rsi, msg
mov rdx, 14
syscall
xor rdi, rdi
mov rax, 60
syscall
msg:
.ascii "Hello, world!\n"
You can see that there's not really that much more here. A little more preamble at the top, and then two syscalls instead of one, followed by a little data section that actually holds the text we're printing to the screen. Let's go through this, line by line.
.intel_syntax noprefix
Trying to compile in gcc gives me errors:
hello.s: Assembler messages:
hello.s:4: Error: ambiguous operand size for `mov'
hello.s:5: Error: ambiguous operand size for `mov'
hello.s:6: Error: too many memory references for `lea'
hello.s:7: Error: ambiguous operand size for `mov'
hello.s:9: Error: ambiguous operand size for `mov'
hello.s:10: Error: too many memory references for `xor'
but it works just fine with clang.
.globl _start
We know this one: marking the _start
symbol available to the linker.
Next, on the the body of the program:
_start:
mov rax, 0x1
mov rdi, 0x1
lea rsi, msg
mov rdx, 14
syscall
Looking at this chart of linux
syscalls,
I can see that what is going to be in rax
when I reach syscall
is 1
,
which corresponds to the SYSWRITE
system call. Its "arguments" live in the
registers rdi
, rsi
, and rdx
:
%rax: 1
System call: sys_write
%rdi: unsigned int fd
%rsi: const char *buf
%rdx: size_t count
Where fd
(file descriptor) is for the output stream, buf
is a pointer to
the buffer from which we want to write and count
is the number of bytes we
want to write.
What are we putting into those registers, then?
In rax
, we insert the syscall number for sys_write
: 1
rdi
: The file descriptor for stdout
, defined by
POSIX)
to be 1
rsi
: a pointer to the buffer we write from. Here, we see the lea
instruction for "load effective
address", and the right hand value is a label msg
that is defined at the bottom of the file:
msg:
.ascii "Hello, world!\n"
Finally, in rdx
we put in 14
, the length of the buffer. It is not runtime
known, since there is no runtime!
With all of these loaded into the appropriate registers,
syscall
Executes sys_write
and writes the contents of the memory buffer to stdout.
The remaining three lines:
xor rdi, rdi
mov rax, 60
syscall
Might be familiar; we're loading something into rdi
and calling sys_exit
(60), just like the first example. In this case though, we're xor
ing rdi
with itself, which is the same thing as setting it to 0
: the success code.
And that's it!
Coda:
My intention from here was to find exactly where the zig compiler emits these
exact instructions. And it does do that, sort of... I mean, it has to,
because these are the instructions to do this operation on x86. You may have
noticed, too, that the pertinent registers figure prominently in the syscall3
inline asm from above... rax
, rdi
, rdx
, and rsi
:
// ...
: [number] "{rax}" (@enumToInt(number)),
[arg1] "{rdi}" (arg1),
[arg2] "{rsi}" (arg2),
[arg3] "{rdx}" (arg3)
// ...
But as alluded to earlier, even here at this lowest of levels (I mean, this is almost machine code, right?) there is flexibility for the compiler to be creative. And I suppose the specifics of the compiler's output from this command:
$ zig build-exe hello.zig -femit-asm --strip --single-threaded --release-small
are in the end a bit extraneous to my ultimate point, which is that even hello world is complicated.
It is, canonically, the "simplest" program you can write, and yet it is built on top of heaps of abstracted complexity that for the most part, none of us ever really think about all that much. Do you remember the first time you heard some systems engineer refer to C as a high level langauge? Did it sound weird? Do you remember the first time you realized that it actually is a high level language?
As programmers, our sharpest tool is abstraction, our strongest tool is abstraction, and our most useful tool is abstraction. It is in some sense the only thing we really do: turn information and transformations upon information into other forms of meta information that we manipulate with even more abstractions. The whole idea is that we deal with emergent complexity and then tuck it neatly beneath an interface of some sort and then don't ever think about it again until we have to. But it's still there, bubbling under the crust of the world we're continuously saying hello to, and it's worth it sometimes to dig down a little deeper and marvel at the gems.