Monday, 24 September 2012

The Lazarus Blog

Here is a new blog I'm starting, which will hopefully be updated more frequently than this one. In retrospect, I think I made the scope for this one too narrow. The plan is to still blog about Linux/programming and related topics (as you may have guessed by the title), but to loosen the scope of acceptable material to general programming and... well, basically anything.

Tuesday, 23 March 2010

Libraries - Your Own and The System's - Part 3

This post will briefly cover a range of very non-essential information about shared libraries, starting with the dynamic linker and also covering playing around with programs like objdump and strip.

Then we'll do something a bit more interesting - with some knowledge of how we can change the behaviour of the dynamic linker, we can "intercept" calls to a shared library and do whatever we want, including gathering statistics, passing through the arguments to the real function or pretending an error occurred to see what would happen. This means we can intercept standard functions like puts(), malloc() and pthread_create(), and we don't even need to recompile either the program or the library in order to do this.

The Linux Dynamic Linker

The dynamic linker for Linux is, at the time of writing,, and is usually found in the /lib/ directory, along with other essential system libraries like the standard C library. The main job of a dynamic linker is to find the shared libraries a program needs, and to load the relevant parts of those shared libraries into memory, ready for the program to use.

Obviously, if ld-linux has to find libraries, you have to have some way of telling it where those libraries are. First off, there are built-in library search paths it looks in, usually /lib/ and /usr/lib/. Then there are two main ways of specifying library search paths after build-time. The first is to use the environment variable LD_LIBRARY_PATH, which is a ':'-separated list of directories the dynamic linker should look in, much like the PATH environment variable used to tell bash where to look for executables.

Generally, you'd just use LD_LIBRARY_PATH for quick-and-dirty tests, like when we need to specify where a newly created library is in this post.
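To make the ':'-separated format concrete, here's a minimal sketch in C of picking one entry out of such a list. The function name path_entry is made up for this example, and this is only an illustration of the format - it isn't how the dynamic linker itself is implemented:

```c
#include <string.h>

/* Copy the index-th entry of a ':'-separated list into out (size n).
 * Returns 1 on success, 0 if there is no such entry or it doesn't fit.
 * path_entry is a hypothetical helper, not part of any library. */
int path_entry(const char *list, size_t index, char *out, size_t n)
{
    const char *start = list;

    for (;;)
    {
        /* the entry runs up to the next ':' or the end of the string */
        size_t len = strcspn(start, ":");

        if (index == 0)
        {
            if (len + 1 > n)
                return 0;
            memcpy(out, start, len);
            out[len] = '\0';
            return 1;
        }

        if (start[len] == '\0')
            return 0;       /* ran out of entries */

        start += len + 1;   /* skip past the ':' */
        index--;
    }
}
```

You could feed it the result of getenv("LD_LIBRARY_PATH") and count upwards from index 0 until it returns 0 to walk every directory in the list.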

Note that LD_LIBRARY_PATH is searched first, so you could play havoc trying to run any program if you get things wrong. Here, we put an empty C library into the current directory and set LD_LIBRARY_PATH to search there, so it'll find that first, which certainly won't be healthy:

$ ldd `which ls` | grep libc
=> /lib/tls/i686/cmov/ (0xb75f4000)
$ touch
$ LD_LIBRARY_PATH=`pwd` ls
ls: error while loading shared libraries: /home/john/test/
file too short

Using /etc/ is much neater if you need a more permanent solution. It contains a list of places to search for shared libraries at run-time, with one on each line, not ':'-separated. Unlike LD_LIBRARY_PATH, however, the directories in this file aren't searched every time you run a program.

Instead, the ldconfig program searches the paths it finds in /etc/ and caches their names and locations in /etc/ This means that any time you add a new directory with new shared libraries to /etc/, or even if you add new shared libraries to the directories already in /etc/, you have to re-run the ldconfig program, and you have to run it as root.

Affecting the dynamic linker

Aside from telling the dynamic linker where to find extra libraries, there are a couple of interesting environment variables that'll affect its behaviour.

Firstly, you can get the linker to produce debugging output by setting LD_DEBUG in the environment. The most useful thing to set this to is libs, as this will show you where it's searching for library dependencies, and exactly what libraries the linker finds and uses. You can also set it to symbols, which will show you exactly which library the linker gets each and every symbol from. These will produce a lot of output and send it to stderr, unless you specify somewhere else you want the output in LD_DEBUG_OUTPUT. You can find out what other kinds of debug output are available by setting LD_DEBUG=help and running any dynamic executable.

Then there's LD_BIND_NOW, which I always set in my environment when I'm developing (it doesn't matter what you set it to, so long as it's non-empty). The normal behaviour of the linker is to ensure that all the libraries that the executable says it needs are present before running the program. Then, when the program tries to actually use a symbol from one of the libraries, the linker goes and searches through the aforementioned libraries for that symbol. If it finds it, your program continues running in the usual way - if it can't, you get an unavoidable crash.

Setting LD_BIND_NOW changes this behaviour - instead of resolving symbols as they're needed, setting this will force the linker to resolve all symbols before control is handed over to your program. This is incredibly useful to a programmer - it means you can be sure that your compile-time library and your run-time library are compatible, and any differences will get pointed out to you as soon as possible. Otherwise, an unresolved symbol might go unnoticed indefinitely, simply because some error condition never occurs when an executable actually runs.

You can see the effect of LD_DEBUG and LD_BIND_NOW by running a simple program:

$ cat > prog.c << 'eof'
> #include <stdio.h>
> #include <unistd.h>
> int main()
> {
> puts("hello...");
> sleep(2);
> puts("...goodbye");
> }
> eof
$ gcc prog.c -o prog
$ ./prog

And now we'll run the same program, but we'll get the dynamic linker to tell us when it resolves a symbol. You get lots of output from the linker, so only the very end of it is shown below:

$ LD_DEBUG=symbols ./prog
(lots of output...)
3347: transferring control: ./prog
3347: symbol=puts; lookup in file=./prog [0]
3347: symbol=puts; lookup in file=/lib/tls/i686/cmov/ [0]
3347: symbol=sleep; lookup in file=./prog [0]
3347: symbol=sleep; lookup in file=/lib/tls/i686/cmov/ [0]
3347: calling fini: ./prog [0]
3347: calling fini: /lib/tls/i686/cmov/ [0]

3347 is the program's PID. As you can see, the output shown starts when control is transferred to prog. The next lines show the linker looking for the symbol puts, starting in the executable and then moving onto, where it is of course found - it can then be called, and we see the hello... message from the call to puts. The linker then goes on a similar search, this time looking for sleep, and then we see ...goodbye without a symbol search - puts has already been found and stored internally, ready to be quickly used again (you can change that behaviour though, as we'll do later in this post).

What if the linker hadn't been able to find sleep? Well, you'd have half a program run - we can prevent this using LD_BIND_NOW:

$ LD_DEBUG=symbols LD_BIND_NOW=yes ./prog
(lots of output...)
3392: symbol=__libc_start_main; lookup in file=./prog [0]
3392: symbol=__libc_start_main; lookup in file=/lib/tls/i686/cmov/ [0]
3392: symbol=sleep; lookup in file=./prog [0]
3392: symbol=sleep; lookup in file=/lib/tls/i686/cmov/ [0]
3392: symbol=puts; lookup in file=./prog [0]
3392: symbol=puts; lookup in file=/lib/tls/i686/cmov/ [0]
3392: symbol=_r_debug; lookup in file=./prog [0]
3392: symbol=_r_debug; lookup in file=/lib/tls/i686/cmov/ [0]
3392: symbol=_r_debug; lookup in file=/lib/ [0]
(...more output...)
3392: transferring control: ./prog
3392: calling fini: ./prog [0]
3392: calling fini: /lib/tls/i686/cmov/ [0]

Here, puts and sleep are resolved before control is transferred to prog - if there's a problem with any symbol the program uses, no matter if it's from an esoteric, never-used error-handler, you're guaranteed to know about it as soon as you run your program.

There are other environment variables that affect the dynamic linker - they're detailed in the ld-linux(8) man page. We'll be using one of these extra ones, LD_BIND_NOT, later.

Symbol visibility and stripping libraries

Not all symbols in a shared library are available for linking against - just because it's there, doesn't mean you can use it.

Let's take an example - a simple library with two dummy functions:

$ cat > library.c << 'eof'
> static void foo() {}
> void bar() { foo(); }
> eof
$ gcc library.c -shared -o -fPIC
$ nm | egrep 'foo|bar'
00000431 T bar
0000042c t foo

So, bar() is an ordinary function, and ends up in library as being of type "T", which means it's in the text (code) section of the resulting shared library. Nothing new there. foo(), on the other hand, has been declared static. This is not just a language-level construct that's only implemented in C - it's also enforced at the level of the linker. The symbol foo is shown to be of type "t". This is exactly the same as "T", except that it means that foo is a local symbol, and so a linker can't (in reality, will refuse to) link against it. It can still be used internally by anything in, though.
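static only helps within a single source file. As a side note, gcc also has a visibility attribute (a GNU extension) that hides a function from users of a shared library while still letting the library's other source files call it. Here's a minimal sketch - the function names are made up for illustration, and you'd want to check the result with nm on your own system, where the hidden function should show up as a local symbol just like the static one:

```c
/* Not exported: static gives the symbol local binding ("t" in nm). */
static int helper(void)
{
    return 41;
}

/* Hidden: callable from the library's other source files, but
 * (with gcc's visibility attribute) not from programs linking
 * against the finished shared library. */
__attribute__ ((visibility ("hidden")))
int internal_api(void)
{
    return helper() + 1;
}

/* Exported as usual ("T" in nm). */
int public_api(void)
{
    return internal_api();
}
```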

A couple of examples (that we'll also use a bit later) show that we can use bar():

$ cat > use-bar.c << 'eof'
> int main() { bar(); }
> eof
$ gcc use-bar.c -o use-bar -L. -lrary
$ LD_LIBRARY_PATH=. ./use-bar

But we can't even link if we try and use foo():

$ cat > use-foo.c << 'eof'
> int main() { foo(); }
> eof
$ gcc use-foo.c -o use-foo -L. -lrary
/tmp/ccsw1ueO.o: In function `main':
use-foo.c:(.text+0x7): undefined reference to `foo'
collect2: ld returned 1 exit status

Even though there is definitely a symbol foo in, our linker has refused to use it.

And now onto stripping programs and libraries, which will be fairly simple if you've understood the previous discussion.

Because programs and libraries contain some (usually lots of) symbols they simply don't need, we can just get rid of them. For a program, the symbols it exposes don't matter - all internal references are against sections and/or offsets. And through the magic of virtual memory, the program can always be loaded at the same virtual address, regardless of where it actually is in physical memory:

$ objdump -x use-bar | egrep -w 'start address|_start'
start address 0x08048410
08048410 g F .text 00000000 _start

The above shows that the symbol _start will always be located at virtual address 08048410, and that this is the start address of the program. Having this particular symbol present is purely a mechanism that allows us to inspect object files and libraries more easily - it's never used when executing the program, only the hard-coded address is.

So, let's use the strip program to cut down the size of use-bar. This is how it starts out:

$ du -bh use-bar
8.1K use-bar
$ nm use-bar
08049f18 d _DYNAMIC
0804859c R _IO_stdin_used
w _Jv_RegisterClasses
08049f08 d __CTOR_END__
08049f04 d __CTOR_LIST__
08049f10 D __DTOR_END__
08049f0c d __DTOR_LIST__
080485a0 r __FRAME_END__
08049f14 d __JCR_END__
08049f14 d __JCR_LIST__
0804a014 A __bss_start
0804a00c D __data_start
08048550 t __do_global_ctors_aux
08048440 t __do_global_dtors_aux
0804a010 D __dso_handle
w __gmon_start__
0804854a T __i686.get_pc_thunk.bx
08049f04 d __init_array_end
08049f04 d __init_array_start
080484e0 T __libc_csu_fini
080484f0 T __libc_csu_init
U __libc_start_main@@GLIBC_2.0
0804a014 A _edata
0804a01c A _end
0804857c T _fini
08048598 R _fp_hw
0804839c T _init
08048410 T _start
U bar
0804a014 b completed.6990
0804a00c W data_start
0804a018 b dtor_idx.6992
080484a0 t frame_dummy
080484c4 T main

8.1 kB in size, and 35 symbols. To strip it, just pass it as an argument to strip:

$ strip use-bar
$ du -bh use-bar
5.4K use-bar
$ nm use-bar
nm: use-bar: no symbols

So we've reduced our executable's size by a third, and gotten rid of everything nm looks at by default. At this stage, use-bar does still contain some symbols, though - they're all symbols it won't know the value of until runtime, such as where certain sections are, and what undefined symbols it has - i.e. they're all dynamic:

$ nm --dynamic use-bar
0804859c R _IO_stdin_used
w _Jv_RegisterClasses
0804a014 A __bss_start
w __gmon_start__
U __libc_start_main
0804a014 A _edata
0804a01c A _end
0804857c T _fini
0804839c T _init
U bar

Now we want to do the same for our library - but there's a problem. We can't remove all the symbols - if we did that, both the link-time linker and the dynamic linker would have no way of finding out if the symbols it needs are even in our library, let alone how to actually get hold of the code/data it needs at run-time!

We can, however, get rid of only the unnecessary symbols by passing --strip-unneeded when we strip:

$ du -bh
$ strip --strip-unneeded
$ du -bh

And we can still link and run our use-bar program, just as before.

The savings for are around 20%, but these figures scale well when you strip larger libraries, especially if they're written in C++ since name-mangling can make for ludicrously long symbol names. For example, I just tried stripping the unneeded symbols from a C++ library I'm developing at work - it reduced it from 481 kB to 106 kB.

So why would you even want to keep these unneeded symbols, if they're taking up that much space? Debugging is probably the main thing you'd want them for - a debugger can associate offsets and locations within an executable with its symbols to show you the function you're in or the variable you're accessing. Without these symbols, it can just say what address you're accessing, which is basically useless. This is why libraries and programs are usually only stripped when they're released.

Wrapping library functions called from pre-compiled binaries

Now we'll do something a bit off the wall - we're going to create a wrapper for a standard shared library function that we'll be able to use without having to recompile either the library we're wrapping or the executable whose calls we're intercepting. Our function will be able to do whatever it wants before/after calling the real library function, so the functionality provided is exactly the same, as far as the program's concerned. This method, of course, can only work with shared libraries.

This can be useful for, for example, testing when memory is allocated or freed with malloc()/free(), when threads are created, and so on, when you don't want to (or simply can't) recompile it from source. You could also use this technique for error testing - want to know how your program will react if a standard library call fails? Just wrap it up in something that'll pretend to fail, returning an appropriate error code (don't forget to emulate the full behaviour of the call, like setting errno to something appropriate!).
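As a taste of the error-testing idea, here's a minimal sketch of a stand-in for malloc() that pretends to run out of memory on every third call. The name flaky_malloc is made up so that the snippet can be compiled and tried on its own - in a real wrapper library it would have to be called malloc, and it would need another way of reaching the real function, which is what the rest of this post works up to:

```c
#include <errno.h>
#include <stdlib.h>

static unsigned long calls = 0;

/* Same signature as malloc(), but every third call fails the way
 * the real thing would: NULL returned and errno set to ENOMEM.
 * flaky_malloc is a hypothetical name for this sketch only. */
void *flaky_malloc(size_t size)
{
    if (++calls % 3 == 0)
    {
        errno = ENOMEM;
        return NULL;
    }

    return malloc(size);
}
```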

So, on with it...

When the dynamic linker first encounters an as-yet unresolved symbol, it searches through the list of libraries the executable needs and checks each one for that symbol, until it finds one with the correct symbol. However, the list of shared libraries a program needs is embedded into the executable at link-time, and our library won't be on that list. We get around this by setting the LD_PRELOAD environment variable, which specifies a list of extra libraries to load and search before any others.

So, to demonstrate the first stage (i.e. getting our function called instead of the standard library function) we'll need a program with a call to, say, the puts function that we can intercept, and some code to put in a shared library to masquerade as puts. Note that we'll compile a simple test program now, and we won't have to recompile it again. Also note that we don't have to put anything exotic on the command-line when we compile our program (...and the cards are neatly shuffled, and there's nothing up my sleeves...). Afterwards, we'll intercept calls from a more complicated program - cat.

So, our test program is:

$ cat > program.c << 'eof'
#include <stdio.h>

int main(void)
{
    puts("Don't stop me now");
    return 0;
}
eof

$ gcc program.c -o program
$ ./program
Don't stop me now

Okay, so we just need our "stooge" puts implementation, that'll just print an extra space between each character:

$ cat > wrapper.c << 'eof'
> #include <stdio.h>
> int puts(const char *s)
> {
> while (*s)
> {
> putchar(*s++);
> putchar(' ');
> }
> putchar('\n');
> return 0;
> }
> eof
$ gcc wrapper.c -shared -o -fPIC

And now we can just LD_PRELOAD our library, and our function will be the preferred version of puts:

$ LD_PRELOAD=./ ./program
D o n ' t   s t o p   m e   n o w

One important thing to note is that the function signature should be the same as for the real puts, including the return type. If you know how arguments and return values are passed between functions, you'll know how important that is - if not, that's something I plan to go into at some other point. Either way, you can't go wrong if you just use the same signature, so include the relevant header file and the compiler will complain if you get it wrong.

The next stage is to pass the function call through to the real function, so we can call arbitrary functions without having to do too much work. Now, we obviously can't just call the real function, because the dynamic linker would think that's a call to our faux function, and we'd end up in an infinite loop. The programming interface to the dynamic linker, however, has a feature that's exactly what we need.

In short, the programming interface to the dynamic linker allows us to request, at run-time, to open a dynamic library and resolve symbols from it. This is mostly used for plugins - a program can search for dynamic libraries in some directory it knows contains plugins, open them, query them, and access their symbols. Of course, it only allows you to deal with raw symbols, so all type-safety is lost, but it'll let us do what we need for this purpose.
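For a feel of what that interface looks like, here's a minimal sketch using dlopen() and dlsym() from dlfcn.h. The helper name lookup_symbol is made up for this example. Note the loss of type-safety: all you get back is a void pointer, and it's up to you to cast it to the right function type:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Open a shared library at run-time and look up one symbol in it.
 * Passing NULL as the library name asks for a handle to the running
 * program itself (and the libraries it already has loaded).
 * Returns NULL if either step fails. The library is deliberately
 * left open, since the pointer is only valid while it stays loaded.
 * lookup_symbol is a hypothetical helper, not part of libdl. */
void *lookup_symbol(const char *library, const char *symbol)
{
    /* RTLD_NOW resolves everything up front, much like LD_BIND_NOW */
    void *handle = dlopen(library, RTLD_NOW);

    if (handle == NULL)
        return NULL;    /* dlerror() would say what went wrong */

    return dlsym(handle, symbol);
}
```

A plugin loader would pass the path of each plugin it finds as the library name. On older systems you may need to link with -ldl for this to build.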

The function dlsym() (declared in dlfcn.h) returns a pointer to the location of a symbol in memory. It takes two arguments - a handle to a shared library to search, and the name of a symbol to search for. Normally, a handle is obtained from dlopen(), but it's more convenient for us to use the special handle RTLD_NEXT (this is a GNU extension, so we need to define _GNU_SOURCE to use this). As its name implies, this special handle directs dlsym() to start searching for the symbol from the next library in the dynamic linker's search path onwards - this is exactly the symbol that would have been found if our function hadn't barged in and been found first.

So, we can put this into practice with a new version of wrapper.c, which we now have to link against

$ cat > wrapper.c << 'eof'
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>

int (*__real_puts)(const char *s) = NULL;
int lines = 0;

int puts(const char *s)
{
    /* look up the real puts() the first time we're called */
    if (__real_puts == NULL)
        __real_puts = dlsym(RTLD_NEXT, "puts");

    if (__real_puts == NULL)
        return EOF;

    printf("line %d: ", ++lines);

    return __real_puts(s);
}
eof

$ gcc wrapper.c -shared -o -fPIC -ldl
$ LD_PRELOAD=./ ./program
line 1: Don't stop me now

And there you have it - library functions intercepted with no recompiling.

We can make this slightly neater by using gcc's function attributes - the gcc manual documents them if you want the details. We'll use the constructor attribute, which tells gcc to run a function before main() is called, and the destructor attribute, which tells gcc to run a function after main(), or when exit() is called:

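$ cat > wrapper.c << 'eof'
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

int (*__real_puts)(const char *s) = NULL;
int lines = 0;

/* runs before main(); the function names here are arbitrary */
void __attribute__ ((__constructor__))
wrapper_init(void)
{
    __real_puts = dlsym(RTLD_NEXT, "puts");

    /* without the real puts() there's nothing to pass calls on to */
    if (__real_puts == NULL)
        exit(EXIT_FAILURE);
}

/* runs after main() returns, or when exit() is called */
void __attribute__ ((__destructor__))
wrapper_fini(void)
{
    printf("Wrote %d line(s)\n", lines);
}

int puts(const char *s)
{
    printf("line %d: ", ++lines);

    return __real_puts(s);
}
eof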

$ gcc wrapper.c -shared -o -ldl
$ LD_PRELOAD=./ ./program
line 1: Don't stop me now
Wrote 1 line(s)

Example - changing the behaviour of cat

Now we'll have some fun - we'll use this method to massively over-complicate the common beginner's exercise of printing a program's source code along with its line numbers. We'll do this by finding and intercepting whatever call the standard UNIX utility cat uses to print to the screen.

To find out what function call cat uses to print lines, we can use LD_DEBUG=symbols to find out what symbols the dynamic linker needs to resolve, and also LD_BIND_NOT (set to anything). Normally, when the linker is asked to look up a symbol, it stores its value so it doesn't need to look it up again - LD_BIND_NOT inhibits this behaviour. This means we can look at the symbol that's been looked up just before cat actually prints any output, and know that that's the call we need to intercept:

$ echo 'This is our output' | LD_DEBUG=symbols LD_BIND_NOT=yes cat
(...lots of output...)
3032: symbol=write; lookup in file=cat [0]
3032: symbol=write; lookup in file=/lib/tls/i686/cmov/ [0]
This is our output
(...more output...)

So, the call we obviously need to intercept here is write, since that occurs just before our output is printed. This is using cat from GNU Coreutils version 7.4.

If you're not familiar with write(), have a look at its man page. In short, its signature is ssize_t write(int fd, const void *buf, size_t count). It writes up to count bytes from buf to fd, and returns the number of bytes actually written, or -1 after setting errno.
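One consequence of that return value is that a caller can't assume a single write() sent everything it was given. As a rough sketch of how callers usually cope (write_all is a name made up here, and this isn't taken from cat's source), you loop until the whole buffer has gone out:

```c
#include <errno.h>
#include <unistd.h>

/* Keep calling write() until all count bytes have gone out.
 * Returns 0 on success, or -1 (with errno set by write()) on failure.
 * write_all is a hypothetical helper name. */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0)
    {
        ssize_t n = write(fd, p, count);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;   /* interrupted by a signal - try again */
            return -1;
        }

        p += n;             /* write() may accept fewer bytes than asked */
        count -= (size_t)n;
    }

    return 0;
}
```

Any wrapper we put in front of write() has to keep up this contract: return how many bytes were dealt with, or -1 with errno set.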

Here's an implementation of that does what we want:

/* write-wrapper.c
 *
 * Copyright (c) 2010 John Graham (
 *
 * Wrapper for the write() system call that prints the line
 * number of anything written to stdout using this call,
 * along with the text that would have been printed.
 *
 * See
 */

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <dlfcn.h>

/* pointer to the "real" write() */
ssize_t (*__real_write)(int fd,
                        const void *buf,
                        size_t count);

/* __constructor__ means this is called before main()
 * (the function's name is arbitrary) */
void __attribute__ ((__constructor__))
init(void)
{
    __real_write = dlsym(RTLD_NEXT, "write");

    if (__real_write == NULL)
    {
        puts("Error: could not find symbol \"write\" - exiting");
        exit(EXIT_FAILURE);
    }
}

int current_line = 1;

ssize_t write(int fd, const void *buf, size_t count)
{
    // 1 is stdout - pass writes to any other file
    // descriptor through to the real write()
    if (fd != 1)
        return __real_write(fd, buf, count);

    // strsep alters the string we pass it,
    // so make a copy
    char *buf_cpy = strndup(buf, count);

    if (buf_cpy == NULL)
        return -1;

    // purely so we can free() later
    const char *_buf_cpy = buf_cpy;

    char *buf_ptr = strsep(&buf_cpy, "\n");

    // print one numbered line per newline-terminated chunk
    do
    {
        int ret = printf("Line %d: %s\n", current_line++, buf_ptr);

        if (ret < 0)
            return -1;

        buf_ptr = strsep(&buf_cpy, "\n");
    }
    while (buf_cpy != NULL);

    free((void *)_buf_cpy);

    return count;
}

/* end write-wrapper.c */

Compile as we did before, and test on its own source code:

$ gcc write-wrapper.c -shared -o -fPIC -ldl
$ LD_PRELOAD=./ cat write-wrapper.c
Line 1: /* write-wrapper.c
Line 2: *
Line 3: * Copyright (c) 2010 John Graham (
Line 4: *
Line 5: * Wrapper for the write() system call that prints the line
Line 6: * number of anything written to stdout using this call,
Line 7: * along with the text that would have been printed.
Line 8: *
Line 9: * See
Line 10: */

(...and so on...)

Line 81:
Line 82: /* end write-wrapper.c */

As a bonus, we get to use the features of cat for free, like its ability to take many files on the command-line, or to read from stdin by using the file - (assuming our implementation of the above is correct... corrections welcome!).

Saturday, 20 March 2010

Libraries - Your Own and The System's - Part 2

This post will be about system libraries - such as those you get in development packages in most Linux distributions - and the mechanisms used to allow you to use multiple versions of the same library.

System libraries

This is the sort of stuff that should be explained to new programmers on Linux, but doesn't really seem to be. *-dev/*-devel packages on Linux were kind of confusing to me at first, and this stuff can be tricky to explain, so if you get lost, the example in the next section should make things clearer.

As you should know by now, some (most) programs use shared libraries. These libraries expose symbols that are associated with a certain block of code or data. These symbols need to be visible at two distinct stages of linking.

The first stage is at link-time. The linker (e.g. ld) needs to be able to see some evidence of a symbol to know that it's there. It can then put information about where to get the code/data associated with the symbol into the executable. The second stage is at run-time. The dynamic linker (on Linux, usually has to use the information in the executable to find the appropriately named library that contains the symbol, so that it can load the associated code/data into memory (unless it's already been loaded into memory by another program...), ready to be executed.

Because of this, the shared library that the dynamic linker sees at run-time has to be compatible with the one the linker saw at link-time. If a function in the library is updated for a bugfix, everything'll work just fine (unless the program depended on the bug - a major problem on Windows, according to some Windows programmers I know), but if its API is changed (i.e. a function is redefined, or simply given a different name) then the dynamic linker won't be able to find the relevant symbol and the program will simply crash or - even worse - start executing a function that does something different than it expects, which when you think about it is unlikely to end well.

To overcome this problem, the name of the library searched for by the dynamic linker is different to the name searched for by the link-time linker*. When you specify -lX at link-time, the linker searches for a file named For example, on my system, (the library that contains definitions of standard C mathematical functions in <math.h>, such as cos and sin) is located at /usr/lib/, so passing -lm to the linker makes it search for this file. However, is not a normal file, but a symlink. There are usually two "levels" of symlinking when it comes to shared libraries:

$ ls {/usr,}/lib/*
lrwxrwxrwx 1 root root 14 2010-01-13 09:47 /lib/ ->
lrwxrwxrwx 1 root root 14 2010-01-13 09:47 /usr/lib/ -> /lib/

The real library is /lib/ This means you can develop by just linking with, and you'll automatically use the latest version of the library.

There is, of course, another aspect of the naming scheme to consider. When was compiled, somewhere on the (probably huge) command-line would have been an option like This soname is the filename that the link-time linker should tell the dynamic linker to search for. This way, if you get a bugfix version of a library like, your symlink will be updated to point to this newer library. All your programs that want to use will now use the new code, and none of them need to be recompiled.

Additionally, you could update to, and programs compiled against would still work so long as there's a on your system - it would still point to the latest version that's compatible with version 6, and any program that wants to use version 7 is free to do so. And then by updating the symlink, all your compilations will be done against the newer version of the library.
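To make the naming convention concrete, here's a minimal sketch that derives the usual soname from a fully-versioned file name by cutting it off after the major version number. The function name and the example file names in the comment are purely illustrative, and this is only the common convention - the actual soname is whatever was given when the library was built:

```c
#include <string.h>

/* Derive the conventional soname from a real library file name by
 * keeping everything up to the first number after ".so.", e.g.
 * "libexample.so.1.0.1" -> "libexample.so.1".
 * Returns 1 on success, 0 if name doesn't look like lib*.so.N[...]
 * or out (of size n) is too small.
 * conventional_soname is a hypothetical helper name. */
int conventional_soname(const char *name, char *out, size_t n)
{
    const char *so = strstr(name, ".so.");
    size_t len;

    if (so == NULL)
        return 0;

    /* length up to and including ".so." */
    len = (size_t)(so - name) + 4;

    /* there has to be at least one digit of major version */
    if (name[len] < '0' || name[len] > '9')
        return 0;

    while (name[len] >= '0' && name[len] <= '9')
        len++;

    if (len + 1 > n)
        return 0;

    memcpy(out, name, len);
    out[len] = '\0';
    return 1;
}
```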

An example

So, let's look at a small do-it-yourself example. We'll use from my last post.

The first thing we need to do is compile our library, and I assume you have func1.o and func2.o handy from the last post. We'll compile version 1.0.1 of libexample, and we'll tell the dynamic linker to use version 1, which will then be a symlink to the latest version 1 of the library. So delete if it's still in the current directory, and:

$ gcc func1.o func2.o -o -shared -Wl, -fPIC
$ ln -sv
`' -> `'
$ ln -sv
`' -> `'
$ ls*
lrwxrwxrwx 1 john john 15 2010-03-23 21:54 ->*
lrwxrwxrwx 1 john john 19 2010-03-23 21:54 ->*
-rwxr-xr-x 1 john john 6.7K 2010-03-23 21:39*

We now have our chain of shared libraries we need in order to use our versioning scheme. When we compile an application against the library, we tell it to look for in the current directory:

$ gcc program.c -o program-shared -I. -L. -lexample

However, if we look at the dynamic dependencies of program-shared:

$ ldd program-shared
=> (0xb78d5000)
=> not found
=> /lib/tls/i686/cmov/ (0xb776f000)
/lib/ (0xb78d6000)

We see it depends on - there's no mention of the more general or the more specific So if we make an incompatible, there's no problem - we redefine the symlink to point to so that any new compilations take place against our updated library, but old programs compiled against will still be able to find it, and will still work. Plus, if we update libexample.1.0.1 to, we can simply redefine the symlink libexample.1 to point to libexample.1.0.2, and programs using the old library will use new bugfixes.

How Linux development packages are arranged

So, now that you've got the general principle of this method, understanding Linux development packages should be a cinch.

When a user gets the standard package, they get all the versioned library files. The Debian libssh-2 package here, for example, will give you the following shared library files:

  • /usr/lib/

  • /usr/lib/

While the development package just gives you:

  • /usr/lib/

Along with all the header files you need and a static version of the library.

My next post will cover the behaviour of the dynamic linker (including how you can change it) such as where it searches for run-time libraries and when it resolves undefined symbols, as well as how you can use your own custom functions as a "front-end" to your system's library functions.

* I couldn't think of a better name for it, and it's normally just called "the linker", but that would just be confusing. Oh well.

Sunday, 20 December 2009

Libraries - Your Own and The System's

This post will be all about libraries. First, we'll talk about building and using our own (albeit very simple) shared and static libraries, and then about system libraries, including the standard C library as well as libraries that most Linux distributions will supply in development packages.

Looking at shared libraries will also lead us to play with the dynamic linker, and then in the next post, we'll build on this one to do some funky stuff, like provide wrappers to standard functions to count how many times they're called.


Libraries - both shared and static - are basically just collections of object files. So to make a library, we need to start with some object files. We'll also need a header file so that our test program we create knows how to call whatever functions we provide, so we'll start with that. Our library will consist of two functions, both of which will simply print a string to stdout:

// example.h

void func1(void);
void func2(void);

And each of our library's functions is defined in a separate source file:

// func1.c

#include "example.h"
#include <stdio.h>

void func1(void)
{
    puts("in func1");
}

// func2.c

#include "example.h"
#include <stdio.h>

void func2(void)
{
    puts("in func2");
}

We'll also have a sample program, that'll simply use the information in the header file to call our two functions:

// program.c

#include <example.h>

int main(void)
{
    func1();
    func2();

    return 0;
}

Note that we've used #include <example.h> instead of #include "example.h", which is more akin to how we'd use our library if we were a "real" user using a "real" library.

Now we'll compile our source code into object files using the magic of separate compilation and gcc's -c option to make gcc compile without linking, which won't work anyway, as there is no main() function in our library (if you're not sure what I mean by "compile but don't link", see my previous posts, here and here). Anyways, to compile our object files:

$ gcc func1.c -c -o func1.o
$ gcc func2.c -c -o func2.o

And we now have the two object files we need in order to build our library. Note that if we just try and use gcc to compile our two object files, we'll get an error:

$ gcc func1.o func2.o
/usr/lib/gcc/i486-linux-gnu/4.4.1/../../../../lib/crt1.o: In function `_start':
/build/buildd/eglibc-2.10.1/csu/../sysdeps/i386/elf/start.S:115: undefined reference to `main'
collect2: ld returned 1 exit status

This is a classic linker error - the object file crt1.o has an undefined reference to a symbol called main, which the application programmer is expected to provide. We haven't, hence the error.

Anyways, we'll now get on with making our libraries.

Making the static library

Static libraries are easier to deal with than shared libraries, so we'll look at them first. A static library for a library called "example" will usually be called libexample.a. Most *NIX linkers (and the GNU linker) will automatically look for static libraries named with a lib*.a format.

To actually create our static library, we can use ar, which is part of the binutils package. It allows you to do many things with archives, but we're interested in its ability to insert/replace object files into them, so we'll specify the r option. We also want to create the archive, so we'll pass the c option. Apart from that, we just need to specify the name of the library we want to create (libexample.a) and the object files we want to insert into it (func1.o and func2.o) like so:

$ ar rc libexample.a func1.o func2.o

And we can see the contents of the archive with the t option:

$ ar t libexample.a
func1.o
func2.o
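
To make the "a static library is just an archive of object files" point concrete, here's a minimal sketch in C that checks whether a buffer looks like the start of an ar archive. It assumes the common ar format, in which the file begins with the eight characters "!<arch>" followed by a newline; the function name looks_like_ar_archive() is made up for this example and isn't part of binutils.

```c
#include <stddef.h>
#include <string.h>

/* In the common ar format, every archive starts with this 8-byte string. */
#define AR_MAGIC     "!<arch>\n"
#define AR_MAGIC_LEN 8

/* Return 1 if the buffer starts with the ar magic string, 0 otherwise. */
int looks_like_ar_archive(const void *buf, size_t len)
{
    if (buf == NULL || len < AR_MAGIC_LEN)
        return 0;
    return memcmp(buf, AR_MAGIC, AR_MAGIC_LEN) == 0;
}
```

If you read the first eight bytes of libexample.a and pass them to this function, it should return 1, whereas the first bytes of func1.o should give 0.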

So, we've got our static library. Let's make use of it in the aforementioned program.c. To do this, we'll have to tell gcc to find our include file in the current directory with -I., and we'll also have to tell the linker to look for libexample.a with -lexample (the linker assumes the "lib" prefix and ".a" or ".so" suffix), and to look for our library in the current directory with -L.:
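
The naming rule behind -lexample can be sketched in a few lines of C. This is only an illustration of the "lib" prefix and ".a"/".so" suffix convention described above, not how the linker is actually implemented, and library_file_name() is a hypothetical helper.

```c
#include <stdio.h>

/*
 * Build the file name searched for when "-l<name>" is given.
 * Hypothetical helper: writes "lib<name>.so" (shared != 0) or
 * "lib<name>.a" (shared == 0) into out.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int library_file_name(char *out, size_t out_size, const char *name, int shared)
{
    int n = snprintf(out, out_size, "lib%s%s", name, shared ? ".so" : ".a");
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

So for -lexample the two candidates are libexample.so and libexample.a, and the shared one is tried first.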

$ gcc program.c -o program-static -I. -L. -lexample
$ ./program-static
in func1
in func2
$ ldd program-static
=> (0xb776d000)
=> /lib/tls/i686/cmov/ (0xb760c000)
/lib/ (0xb776e000)

Note that, by default, gcc will look to use shared libraries (with a .so suffix), but since we don't have one it'll use our static one. You'll notice that the generated executable has a dynamic dependency on, for example, the system C library, but not on anything like libexample - our program has extracted all the relevant code from our libexample.a, and so doesn't need it from now on.

That's about all there is to static libraries - you bung object files in an archive file, then link to them at link-time, and that's it. There's a bit more to shared libraries, since you also need them present (and locatable...) at run-time.

Making the shared library

Now we come to making the shared library, for which we use gcc. The easiest way to do this is by passing gcc the -shared flag. Some architectures also require you to pass -fPIC, to tell the compiler to generate position independent code - since your library will be loaded into memory alongside an executable and other libraries, it needs to be able to be placed anywhere in memory, so this flag stops the compiler from hard-coding addresses into the library:

$ gcc func1.o func2.o -o libexample.so -shared -fPIC

Now, if we give the same command to compile program.c as we did before, gcc will prefer the shared library over the static one:

$ gcc program.c -o program-shared -I. -L. -lexample
$ ldd program-shared
=> (0xb783c000)
libexample.so => not found
=> /lib/tls/i686/cmov/ (0xb76db000)
/lib/ (0xb783d000)

You can see that program-shared does indeed depend on libexample.so. You might have noticed the "not found" text, and indeed if you try and run our program, you'll get an error:

$ ./program-shared
./program-shared: error while loading shared libraries: libexample.so: cannot open shared object file: No such file or directory

This is down to the dynamic linker, ld-linux, which lives under /lib/ on most current Linux systems. When we run our executable, the dynamic linker has to resolve each shared library dependency in order to find the code the program needs to run. It will look in certain standard directories, like /lib/ and /usr/lib/, but none of those contain a file named libexample.so that it can use to resolve symbols.

Luckily, there are several ways we can resolve this - by passing an -rpath option at link-time, by adding an entry to the dynamic linker's configuration under /etc/, or by using the environment variable LD_LIBRARY_PATH. We'll be using the last one. Usually you'd prefer one of the first two, but LD_LIBRARY_PATH is more explicit.

Basically, we can give the dynamic linker extra locations in which to look for shared libraries by listing them in LD_LIBRARY_PATH as a ':'-delimited list of paths, much like the PATH environment variable tells bash where to look for executables. So, we can run our program by setting this to the present working directory:
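
To show what "a ':'-delimited list of paths" means in practice, here's a rough sketch in C of how such a list could be walked to build the candidate file names to try. It's only an illustration under simplifying assumptions: the real dynamic linker does a lot more, empty entries aren't treated specially here, and both function names are invented for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the index-th directory of a ':'-separated list into out.
 * Returns 0 on success, -1 if there is no such entry or it does not fit.
 */
int path_list_entry(const char *list, size_t index, char *out, size_t out_size)
{
    const char *start = list;

    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        if (index == 0) {
            if (len >= out_size)
                return -1;
            memcpy(out, start, len);
            out[len] = '\0';
            return 0;
        }
        if (end == NULL)
            return -1;              /* ran out of entries */
        start = end + 1;
        index--;
    }
}

/*
 * Build "<dir>/<file>" for the index-th directory in the list, i.e. the
 * path a search would try at that position.
 */
int candidate_path(const char *list, size_t index, const char *file,
                   char *out, size_t out_size)
{
    char dir[256];
    int n;

    if (path_list_entry(list, index, dir, sizeof dir) != 0)
        return -1;
    n = snprintf(out, out_size, "%s/%s", dir, file);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Calling candidate_path() with index 0, 1, 2 and so on until it returns -1 gives the candidates in the order they'd be tried, and the first one that exists wins.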

$ LD_LIBRARY_PATH=`pwd` ./program-shared
in func1
in func2

We can also demonstrate the magic of shared libraries, by updating libexample.so and not having to touch program-shared:

$ sed -i 's/in func2/In new, improved func2!/' func2.c
$ gcc func2.c -c -o func2.o
$ gcc func1.o func2.o -o libexample.so -shared -fPIC
$ LD_LIBRARY_PATH=`pwd` ./program-shared
in func1
In new, improved func2!

Okay - the next post will deal with system libraries, and some more funky stuff.

Thursday, 15 October 2009

The Compilation Chain - Part 2

Hi there. In this post we'll be taking a look at how object files are generated from the assembler that gcc outputs, and how they're linked to produce executables and libraries. We'll be taking a look at what an object file actually is, and what the linker has to do to produce an executable file or a library. If you've not read my last post, which was about the first two stages of turning a source file into an executable, now might be a good time, but it's not essential reading if you only want to know about object files.


From the assembler file we generated in the last post, hello.s, we can use GNU as (from the binutils package) to assemble it:

$ as -o hello.o hello.s

Alternatively, you can get gcc to produce an object file by passing it the -c option, which stops gcc from trying to link the object file to produce an executable. This is essential if none of the files you're compiling contain a main() function.

Okay, so now we have an object file, what can we do with it? Well, there are a couple of things you can do, but by far the most useful for "general" programming is to query the symbol table. This is incredibly useful for resolving linker errors involving symbols that aren't defined, or are defined more than once. nm is a useful program included in binutils for looking at symbol tables, and is fairly simple - objdump is more heavyweight if you want to do more, and can even disassemble object files back into assembler. For our purposes, nm will do just fine.

If you simply run nm with our object as an input, you'll get a list of non-debugging symbols:

$ nm hello.o
00000000 T main
U puts

Note that if you want to use nm to look at shared libraries, you have to pass it the --dynamic flag or it'll just complain that there are no symbols.

Symbols mark the position of certain things in code, so you'll generally have a symbol for every function you define or use, and any variables that are either defined outside of any function, or declared static within a function - in short, any variable that's not an automatic variable, since automatic variables exist on the stack. For more detail, I'll be explaining about the stack, function calls and return values in a later post.

The output of nm above has three columns, which show (1) the value of the symbol, (2) the type of symbol, and (3) the name of the symbol. As you can see, the main symbol, which marks the start of our main() function, has the value 0 and type T, meaning it is a symbol that is defined in hello.o at the start of the .text section. Object files can contain many sections, of which the most important are:

  • The read-only .text section, which contains our program code,

  • The .data section, which contains data that's initialised by the programmer

  • The .bss section, which contains data that is zeroed out before entry to our main() function, and

  • The .rodata section which also contains data and is (can you guess?) read-only.
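
As a rough illustration of which definitions end up with symbols and in which section, here's a small hypothetical source file you could compile with gcc -c and inspect with nm. The section and nm type noted in each comment is what gcc usually produces on Linux, but treat those as assumptions and check with nm on your own system.

```c
/* symbols.c - a made-up file to compile with "gcc -c" and inspect with nm */

int initialised = 42;          /* initialised data: usually .data, nm type D */
static int zeroed;             /* no initialiser: usually .bss, nm type b    */
const char greeting[] = "hi";  /* read-only data: usually .rodata, nm type R */

/* Function code lives in .text; a non-static function shows up as type T. */
int count_calls(void)
{
    static int calls;          /* static local: gets a symbol, usually .bss  */
    int doubled;               /* automatic: lives on the stack, no symbol   */

    calls++;
    doubled = calls * 2;
    zeroed = doubled;
    return calls;
}

/* Another T symbol; it reads the file-local variable set above. */
int last_doubled(void)
{
    return zeroed;
}
```

The static local will typically appear in the nm output with a numeric suffix appended to its name, and the automatic variable doubled won't appear at all.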

The other symbol nm displays is puts which doesn't have a value, because it's an undefined symbol, as indicated by the type, U. This brings us nicely to the most important job of the linker - resolving undefined symbols.

What a Linker Does

Many of you will know what linking is at a purely functional level, but when I was learning to program, the books I picked up never went into much detail - so first we'll have a brief description of what linking actually is before going into more detail later.

Basically, the job of a linker is to produce an executable or library file, whilst allowing you to import pre-compiled executable code from other libraries. Library files are located in various places on a Linux system, but the most important places are /lib/ and /usr/lib/. All libraries contain some code and/or data associated with a particular symbol - for example, a library file for the C standard library will mark some code and data for use with the puts function.

There are two types of linking and two different kinds of linker. The two types of linking are (1) static linking and (2) dynamic linking. The difference between them is when your program actually gets to see the code and/or data you need. When you link statically, the linker grabs the code from a static library (suffixed with .a) and copies it directly into the executable file you get at the end of linking. This way, all undefined symbols in your object file have been resolved, and your executable always has all the code it needs to run, and always runs the same code, wherever it's run from.

When you link dynamically, no code gets imported into your executable. Instead, the linker program (i.e. the compile-time linker, the canonical one being ld on a Linux system) simply tags the executable with the name of the shared library (suffixed with .so) it wants to use when it comes to run-time. This brings us to the other kind of linker, the dynamic linker - on Linux systems at the time of writing, this is ld-linux, which lives under /lib/. So, when you link dynamically, your program doesn't actually get the code it needs until run-time, and this code is loaded into memory each time your program starts, when the dynamic linker resolves all the undefined symbols in your executable.

This way, your program can run code that wasn't even written when you linked your object files - if the local copy of a shared library is updated for, say, a bug fix, your program will use the new code. It also greatly reduces the size of your executables - if every program had to have its very own copy of the printf() function, as well as many other functions, the executables on your system would be much larger. On the down-side, the user must have a shared library with the correct name (shared library versioning is touched on later in this post, but more in-depth in the next one) and with the appropriate symbols defined, or the program can't run.

Actually Linking

So, for our present purposes, the linker has to produce an executable from our object file. To do this, we can use gcc to pass some necessary options to the linker, ld, including the -static flag if we want to produce a static executable:

$ gcc -o hello-dynamic hello.o
$ gcc -o hello-static hello.o -static
$ ./hello-dynamic
hello world
$ ./hello-static
hello world
$ du -h hello-*
12K hello-dynamic
576K hello-static

In producing our executable, the linker has had to resolve any undefined symbols in our object file(s) - as we saw before, the only undefined symbol is puts. The linker does this by looking in certain library files to see if those libraries define the symbol it's after. The symbol can then be resolved either statically or dynamically, depending on what files the linker finds in its search path and the options you pass on the command line to gcc. Generally, the linker will prefer dynamic linking, unless (1) it only finds a static library file, or (2) you specify the -static option.

The linker will insist on being able to resolve all undefined symbols, either statically or dynamically, before producing your executable program, or you'll get an error.
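
The "every undefined symbol must be defined somewhere" rule can be modelled with a toy in C. This is a deliberately simplified sketch that treats symbol tables as plain arrays of names; a real linker works on object file formats and is far more involved, and the function names here are made up.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if name appears in the table of defined symbols, 0 otherwise. */
static int is_defined(const char *name, const char *const defined[], size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(name, defined[i]) == 0)
            return 1;
    return 0;
}

/*
 * Count the undefined symbols that nothing in the "defined" table provides.
 * In this toy model, linking succeeds only when the result is zero.
 */
size_t count_unresolved(const char *const undefined[], size_t n_undefined,
                        const char *const defined[], size_t n_defined)
{
    size_t i, missing = 0;

    for (i = 0; i < n_undefined; i++)
        if (!is_defined(undefined[i], defined, n_defined))
            missing++;
    return missing;
}
```

With puts as the only undefined symbol and a table that contains puts, the count is zero. Add a symbol that nothing defines and you get the toy equivalent of an "undefined reference" error.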

In our statically linked program, the linker has copied all the code and/or data associated with puts from the static library into our executable. As you can see, this makes our executable quite a bit larger than the dynamically linked one. In the dynamic version, the linker marks the symbol as undefined in the executable, and also stores the name of the shared library where the symbol was found into the executable. When the program is started, the dynamic linker ensures that these libraries are all present in its search path, or it will refuse to let the program start.

Let's have a look at the symbol table for our dynamic executable:

$ nm hello-dynamic
08049f20 d _DYNAMIC
080484ac R _IO_stdin_used
w _Jv_RegisterClasses
08049f10 d __CTOR_END__
08049f0c d __CTOR_LIST__
08049f18 D __DTOR_END__
08049f14 d __DTOR_LIST__
080484bc r __FRAME_END__
08049f1c d __JCR_END__
08049f1c d __JCR_LIST__
0804a014 A __bss_start
0804a00c D __data_start
08048460 t __do_global_ctors_aux
08048340 t __do_global_dtors_aux
0804a010 D __dso_handle
w __gmon_start__
0804845a T __i686.get_pc_thunk.bx
08049f0c d __init_array_end
08049f0c d __init_array_start
080483f0 T __libc_csu_fini
08048400 T __libc_csu_init
U __libc_start_main@@GLIBC_2.0
0804a014 A _edata
0804a01c A _end
0804848c T _fini
080484a8 R _fp_hw
08048294 T _init
08048310 T _start
0804a014 b completed.6635
0804a00c W data_start
0804a018 b dtor_idx.6637
080483a0 t frame_dummy
080483c4 T main
U puts@@GLIBC_2.0

As you can see, there are quite a few more symbols defined in our executable than in our object file. This is because our program has been linked with several other files that provide code that runs before and after our main() function, as we'll see in a little bit. You'll notice there's a symbol _start defined in our executable - that's the real entry point to our program (again, more on that later).

If you really want to look at the symbols in hello-static, feel free - but because the entire puts function has been linked in, there are 1995 symbols, so I'll spare you a print out of them all.

You'll also notice the symbol puts in our dynamic executable has been changed to puts@@GLIBC_2.0. This is an example of symbol versioning - when we run our program, the dynamic linker will insist on seeing a shared library with the appropriate version string (in this case GLIBC_2.0), or it will simply print an error. We can demonstrate this by using sed to change the version string in our executable:

$ ./hello-dynamic
hello world
$ sed 's|GLIBC_2\.0|__xyzzy__|g' hello-dynamic > hello-bad
$ chmod u+x hello-bad
$ ./hello-bad
./hello: /lib/tls/i686/cmov/ version `__xyzzy__' not found (required by ./hello)

(Note that you may need to replace GLIBC_2.0 with the appropriate version string if it's different, and the string you're replacing it with has to be the same length as the version string, or the hard-coded addresses in the executable will point to garbage and "bad" will happen.)
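
To make the name/version split in something like puts@@GLIBC_2.0 explicit, here's a small sketch in C that pulls the two parts apart. It assumes the way nm prints versioned symbols (the name, then one or two '@' characters, then the version string); split_versioned_symbol() is a hypothetical helper, not part of any library.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split "name@@VERSION" (or "name@VERSION") into its two parts.
 * Returns 1 if a version was found, 0 if the symbol is unversioned,
 * -1 if either part does not fit in its buffer.
 */
int split_versioned_symbol(const char *sym,
                           char *name, size_t name_size,
                           char *version, size_t version_size)
{
    const char *at = strchr(sym, '@');
    size_t name_len = at ? (size_t)(at - sym) : strlen(sym);

    if (name_len >= name_size)
        return -1;
    memcpy(name, sym, name_len);
    name[name_len] = '\0';

    if (at == NULL) {
        if (version_size > 0)
            version[0] = '\0';
        return 0;
    }
    while (*at == '@')              /* skip one or two '@' characters */
        at++;
    if (strlen(at) >= version_size)
        return -1;
    strcpy(version, at);
    return 1;
}
```

Comparing strlen() of the extracted version string against your replacement is also an easy way to check the same-length requirement before running sed over a binary.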

As mentioned before, the linker tags the produced executable with the names of any shared libraries it needs at run-time. Actually, that's not quite true - it tags it with the names of any libraries you specify, even if it doesn't need any symbols from them, unless you pass the --as-needed flag to the linker before specifying the library with the -l flag. Anyway, it can be useful to extract this information, which we can do with the ldd program. It outputs the name of each library and, where appropriate, the absolute path at which that library was found on the current system at the time ldd is run:

$ ldd hello
=> (0xb779b000)
=> /lib/tls/i686/cmov/ (0xb7622000)
/lib/ (0xb779c000)

As you can see, the executable we've produced relies on three things. The first entry, which has no path, is a "virtual" shared library, which is exposed by the Linux kernel as a gateway between kernel- and user-space. The second is the standard C library, and the last, under /lib/, is the dynamic linker responsible for loading the above libraries at run-time.

Dynamic linking is by far the most widely used form of linking, so that's what we'll concentrate on in the next section.

Using ld Directly

It can be informative to see how to directly use ld itself to produce an executable - it won't use any files we don't tell it to, so we'll have to point it to everything we want to include in our executable.

First, we'll use the linker to produce a dynamic executable without any initialization. To do this, we'll need to change our basic "hello world" program a bit. Firstly, as mentioned earlier, the _start symbol is the "real" entry point into a program - gcc supplies code that provides this symbol and eventually calls our main function, but because we're not using any startup files, we'll have to provide this symbol ourselves. Secondly, we have to explicitly call exit(), as the normal function return instruction ret is not suitable for exiting a program - we need the exit() function to tell the kernel to terminate the program, or the ret instruction will leave us trying to execute code we're not allowed to execute, and we'll get a segfault.

So, to generate our new program source and the object file we need:

$ cat > skel.c << 'EOF'
#include <stdio.h>
#include <stdlib.h>
int _start()
{
    puts("hello world");
    exit(0);
}
EOF
$ gcc -c -o skel.o skel.c

Note that on some platforms (Mac OS X for example), an underscore is automatically prepended to symbol names - in this case, the function above should be called start() instead of _start(). If in doubt, use nm to see if your compiler does this.

Now, when linking, we need to specify two more things than usual. Firstly, we need to explicitly link with the standard C library by passing the option -lc. Secondly, we need to specify the dynamic linker our executable should use - on a Linux system at the time of writing, this will generally be ld-linux under /lib/. If you're unsure, use the ldd program on our hello executable to see what it uses.

So, to link our skel.o object file:

$ ld -o skel -lc -dynamic-linker /lib/ skel.o
$ ./skel
hello world

If we want to "hand-link" our original hello-dynamic executable, we need to tell the linker where to find the start files gcc provides. We also need to provide a system-dependent -L option naming the directory which contains the libgcc.a and libgcc_s libraries. This path should be in /usr/lib/gcc/<some>/<thing>/ - on my system, it's /usr/lib/gcc/i486-linux-gnu/4.3/.

The full command to link our hello-dynamic program ourselves is:

$ ld -dynamic-linker /lib/                 \
-o hello-dynamic \
/usr/lib/crt{1,i,n}.o \
/usr/lib/gcc/i486-linux-gnu/4.3.3/crt{begin,end}.o \
-L/usr/lib/gcc/i486-linux-gnu/4.3.3 \
-lgcc -lgcc_s -lc hello.o

$ ./hello-dynamic
hello world

I'll leave things there for just now - next time I'll talk about creating static and shared libraries, shared library versioning, and so on.

Wednesday, 14 October 2009

The Compilation Chain - Part 1

Okay, this is the first technical post of this blog, so we'll start off with a thorough overview of the compilation chain in the C language using GNU tools.

Note: To run all the commands in this post, you'll need to have binutils and the GNU C compiler installed, or a compatible toolchain.

When most people say "compilation" they mean getting an executable file from a source file - and most of the time that's all we want to care about. There is, however, a lot more to it than that, especially with the GNU toolchain.

To illustrate, let's see the steps that occur when you compile a simple hello world program. Not that you need it, but just for reference:

$ cat > hello.c << 'EOF'
#include <stdio.h>

int main()
{
    puts("hello world");
}
EOF

Now, generating an executable straight from the source is fairly easy:

$ gcc hello.c -o hello
$ ./hello
hello world

The command gcc produces the executable hello (as specified by the -o hello option) from the source file hello.c. Surely there can't be much more to it? Well, there are four discrete steps that gcc uses to produce an executable file. They are:

  • Preprocessing the C source file using the cpp program,

  • Compiling the processed C source into assembler using the cc1 back-end,

  • Assembling the asm file into an object file using as, and finally

  • Linking the object file with other archives/libraries to produce an executable using the collect2 program, which is essentially a front-end to ld for simple programs

From the above, it might seem that the gcc program doesn't actually do anything that could be described as "compiling" at all - and you'd be right. gcc itself simply acts as a front-end to the above four operations. And what with gcc being the flexible beast that it is, you can get it to stop at any of those stages if you want to.


First, let's get gcc to show us our source code after it's been run through the C preprocessor cpp:

$ gcc -o hello.i hello.c -E

The -E option tells gcc to stop after it's finished running the preprocessor. Alternatively, you could have just run the cpp program directly, with the same options as above.

Take a look at hello.i - it's our original hello.c file, except all the preprocessor directives (i.e. everything that starts with a '#') like #include and #define have been resolved. Most of the code is from the #include <stdio.h> statement in our original file, since all this directive does is simply start reading from the specified file and put it into our source. If you want to see our contribution, you have to go right to the last few lines of the file.

This is, of course, incredibly helpful if you want to make sure your macros expand correctly, or if you have problems with missing definitions you're sure should be in a certain header file - the thing you're after could easily have been #undef'd in a file included from another file, or left out because of some obscure #if condition that you can't tell is true or not.
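
Here's the kind of macro problem that looking at preprocessor output makes obvious. The two macros and the functions using them are made up for illustration; running the file through gcc -E should show the expansions noted in the comments.

```c
/* Without parentheses, the argument text is pasted in as-is. */
#define BAD_SQUARE(x)  x * x

/* Parenthesising the argument and the whole body avoids the surprise. */
#define GOOD_SQUARE(x) ((x) * (x))

/* Expands to: return 1 + 2 * 1 + 2; */
int bad_square_of_sum(void)
{
    return BAD_SQUARE(1 + 2);
}

/* Expands to: return ((1 + 2) * (1 + 2)); */
int good_square_of_sum(void)
{
    return GOOD_SQUARE(1 + 2);
}
```

Because multiplication binds tighter than addition, the first function returns 5 rather than 9, which is easy to miss in the source but plain to see in the preprocessed output.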

Compilation Proper

By "compilation proper" I mean the translation from our source language (C) to our target language for this stage (assembly language). For those who aren't familiar with assembly language, also called assembler or asm, it's a very low-level language, only one step up from machine language. Each assembly language instruction corresponds directly to a single machine instruction, and deals directly with hardware registers, instruction pointers and so on. It also exposes the bare symbols in your program, as we'll see in a bit.

We'll take the preprocessed source and compile it to assembly by passing the -S option to gcc:

$ gcc -S hello.i

Now you'll have a file hello.s in the current directory, containing the generated assembler. There are many assembly languages for different machine architectures, so how the assembler looks will depend on the architecture you're compiling for, but here's the listing of the code generated for my x86 machine:

.LC0:
    .string "hello world"
    .globl main
    .type main, @function
main:
    leal 4(%esp), %ecx
    andl $-16, %esp
    pushl -4(%ecx)
    pushl %ebp
    movl %esp, %ebp
    pushl %ecx
    subl $20, %esp
    movl $.LC0, (%esp)
    call puts
    addl $20, %esp
    popl %ecx
    popl %ebp
    leal -4(%ecx), %esp
    ret
    .size main, .-main
    .ident "GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
    .section .note.GNU-stack,"",@progbits

If you're not interested in knowing a bit about the assembler, skip to the next section.

Note that the output is in AT&T syntax - this might look strange if you're used to Intel syntax. One important difference between the two syntaxes is that the operands go the other way round: in AT&T syntax the source comes first and the destination second. For example, the instruction movl %esp, %ebp moves data from the esp register to the ebp register.

Anyway, let's have a look at some of the highlights of the code above that'll help solidify certain things later on - first off, the first few lines aren't instructions, they're assembler directives. Then we come to the line "main:" that looks like a C-style label. It looks like that because that's basically what it is - it simply marks the location of the next instruction. As it happens, it marks the start of our main function, and it will eventually become a symbol in the object file we generate. When any function is called, execution simply jumps to the location of the relevant symbol, and that's all there is to a function call - anything else (such as passing arguments or receiving a return value) has to be coded in assembler.

There will be more on how arguments are passed to functions and how return values are generated in some future post, but we'll just leave that there for now.

So, when our main() function is called, some instructions execute until we get to the "money instruction":

call puts

This instruction moves the address of the symbol puts into a register called the instruction pointer (IP) register (it also pushes the current value of the IP register on the stack - more about that in a later post), which does exactly what it says on the tin - it points to the next instruction the processor should execute. Since the location of the puts function has been placed there, execution will jump to that function and obligingly print our message. When it returns, execution starts at the would-be next instruction (i.e. addl $20, %esp) and continues until we hit the ret instruction near the end of the listing. The last three lines are more directives.
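
If it helps, the way call and ret work together can be modelled with a toy machine in C. This is purely an illustration: the struct and function names are invented, and the toy stack grows upwards through an array, whereas the real x86 stack grows downwards in memory.

```c
#include <stddef.h>

#define STACK_SLOTS 16

/* A toy machine: just an instruction pointer and a small stack. */
struct toy_cpu {
    unsigned ip;                    /* address of the next instruction */
    unsigned stack[STACK_SLOTS];
    size_t sp;                      /* number of stack slots in use */
};

/* "call target": remember where to come back to, then jump. */
int toy_call(struct toy_cpu *cpu, unsigned target, unsigned next_instruction)
{
    if (cpu->sp == STACK_SLOTS)
        return -1;                  /* toy stack is full */
    cpu->stack[cpu->sp++] = next_instruction;
    cpu->ip = target;
    return 0;
}

/* "ret": pop the saved address back into the instruction pointer. */
int toy_ret(struct toy_cpu *cpu)
{
    if (cpu->sp == 0)
        return -1;                  /* nothing to return to */
    cpu->ip = cpu->stack[--cpu->sp];
    return 0;
}
```

After a toy_call() the instruction pointer holds the target address, and after the matching toy_ret() it holds the address of the instruction that followed the call, which is the behaviour described above for call puts and the later ret.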

So there we go - our assembler file, ready to be assembled into an object file.

Assembly and Linking

Well, I've written more than I suspected I would for the previous sections, and there's even more to write on assembling an object file and linking it to produce an executable, so I'll leave that for my next post. I'll also discuss how to generate, inspect and strip (ooh-er) objects and shared and static libraries.

I hope someone out there finds this at least mildly useful - any (constructive!) comments are appreciated.


Hi there, and welcome to this weblog, which is all about the internals of programming on a GNU/Linux system. There's already a wealth of material on how to program in C/C++, and of course on Linux - so that's not what this blog is about.

When I started programming in C, there was lots of information on the syntax of the language, and so on, but information was scarce on things like the compile-assemble-link process, let alone what a shared/static library actually is and how to make one, and so on. All programming books gave me was just enough to compile a program ("you need to include both example1.c and example2.c on the command-line"), but I wanted a much more in-depth understanding, and that's something I had to scrape together and work out for myself.

So this blog is about the compilation chain, linkage, object files, symbol tables, and how to really use tools like ld, ar, nm, objdump, strip and so forth to program in a GNU/Linux environment. I managed to get by for a while without really understanding any of this, but there's only so far you can get without knowing how these things work.

In short: This blog is about all the things I wish someone had told me about programming when I started.

Hopefully you'll find some use for this, or at least find it interesting.

Thanks for reading!