Sunday 20 December 2009

Libraries - Your Own and The System's

This post will be all about libraries. First, we'll talk about building and using our own (albeit very simple) shared and static libraries, and then about system libraries, including the standard C library as well as libraries that most Linux distributions will supply in development packages.

Looking at shared libraries will also lead us to play with the dynamic linker, and then in the next post, we'll build on this one to do some funky stuff, like provide wrappers to standard functions to count how many times they're called.


Preliminaries



Libraries - both shared and static - are basically just collections of object files. So to make a library, we need to start with some object files. We'll also need a header file that tells our test program how to call the functions we provide, so we'll start with that. Our library will consist of two functions, both of which simply print a string to stdout:


// example.h

void func1(void);
void func2(void);


And each of our library's functions is defined in a separate source file:


// func1.c

#include "example.h"
#include <stdio.h>

void func1(void)
{
    puts("in func1");
}



// func2.c

#include "example.h"
#include <stdio.h>

void func2(void)
{
    puts("in func2");
}


We'll also have a sample program that simply uses the declarations in the header file to call our two functions:


// program.c

#include <example.h>

int main(void)
{
    func1();
    func2();

    return 0;
}


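Note that we've used #include <example.h> instead of #include "example.h". The angle-bracket form makes the compiler search its include path rather than starting in the directory of the including file, which is more akin to how we'd use our library if we were a "real" user using a "real" library.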

Now we'll compile our source code into object files using the magic of separate compilation and gcc's -c option, which makes gcc compile without linking. Linking wouldn't work at this point anyway, as there is no main() function in our library (if you're not sure what I mean by "compile but don't link", see my previous posts). Anyways, to compile our object files:


$ gcc func1.c -c -o func1.o
$ gcc func2.c -c -o func2.o


And we now have the two object files we need in order to build our library. Note that if we just try to use gcc to link our two object files into an executable, we'll get an error:


$ gcc func1.o func2.o
/usr/lib/gcc/i486-linux-gnu/4.4.1/../../../../lib/crt1.o: In function `_start':
/build/buildd/eglibc-2.10.1/csu/../sysdeps/i386/elf/start.S:115: undefined reference to `main'
collect2: ld returned 1 exit status


This is a classic linker error - the object file crt1.o has an undefined reference to a symbol called main, which the application programmer is expected to provide. We haven't, hence the error.

Anyways, we'll now get on with making our libraries.


Making the static library



Static libraries are easier to deal with than shared libraries, so we'll look at them first. A static library for a library called "example" will usually be called libexample.a. Most *NIX linkers (and the GNU linker) will automatically look for static libraries named with a lib*.a format.

To actually create our static library, we can use ar, which is part of the binutils package. It allows you to do many things with archives, but we're interested in its ability to insert object files into them (replacing any existing copies), so we'll specify the r option. We also want to create the archive, so we'll pass the c option. Apart from that, we just need to specify the name of the library we want to create (libexample.a) and the object files we want to insert into it (func1.o and func2.o) like so:


$ ar rc libexample.a func1.o func2.o


And we can see the contents of the archive with the t option:


$ ar t libexample.a
func1.o
func2.o


So, we've got our static library. Let's make use of it in the aforementioned program.c. To do this, we'll have to tell gcc to find our include file in the current directory with -I., tell the linker to look for a library called "example" with -lexample (the linker adds the "lib" prefix and a ".so" or ".a" suffix itself), and tell it to look for our library in the current directory with -L.:


$ gcc program.c -o program-static -I. -L. -lexample
$ ./program-static
in func1
in func2
$ ldd program-static
linux-gate.so.1 => (0xb776d000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb760c000)
/lib/ld-linux.so.2 (0xb776e000)


Note that, by default, gcc will look to use shared libraries (with a .so suffix), but since we don't have one it'll use our static one. You'll notice that the generated executable has a dynamic dependency on, for example, the system C library libc.so.6, but not on anything like libexample - the linker has copied all the relevant code out of libexample.a and into our program, so the program doesn't need the library from now on.
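
To make the naming and preference rules above a bit more concrete, here's a rough sketch in C of the lookup that -L. and -lexample imply: walk the search directories in order and, in each one, try lib<name>.so before lib<name>.a. Note that find_library() is a hypothetical helper invented for this post, not something the GNU linker exposes, and it ignores everything else a real linker does (default system directories, -static, linker scripts and so on).

```c
#include <stdio.h>
#include <unistd.h>   /* access() */

/*
 * Hypothetical helper (an illustration only, not real linker code).
 * Looks for lib<name>.so, then lib<name>.a, in each directory in turn.
 * On success, writes the matching path into out and returns 1.
 * Returns 0 (and an empty string in out) if nothing was found.
 */
int find_library(const char *name, const char *const *dirs, size_t ndirs,
                 char *out, size_t outsz)
{
    static const char *const suffixes[] = { "so", "a" };  /* shared first */

    for (size_t d = 0; d < ndirs; d++) {
        for (size_t s = 0; s < 2; s++) {
            int n = snprintf(out, outsz, "%s/lib%s.%s",
                             dirs[d], name, suffixes[s]);
            if (n < 0 || (size_t)n >= outsz)
                continue;          /* path didn't fit in out; skip it */
            if (access(out, F_OK) == 0)
                return 1;          /* first existing file wins */
        }
    }

    if (outsz > 0)
        out[0] = '\0';
    return 0;
}
```

With only libexample.a in the current directory, calling find_library("example", ...) with "." as the single search directory would settle on ./libexample.a, which mirrors what happened when we linked program-static.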

That's about all there is to static libraries - you bung object files in an archive file, then link to them at link-time, and that's it. There's a bit more to shared libraries, since you also need them present (and locatable...) at run-time.


Making the shared library



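Now we come to making the shared library, for which we use gcc. The easiest way to do this is by passing gcc the -shared flag. On many architectures (x86-64, for example), the code that goes into a shared library also has to be position-independent, which we ask for with -fPIC - since your library will be loaded into memory alongside an executable and other libraries, it needs to be able to be placed anywhere in memory, so this flag stops the compiler from hard-coding addresses into the library. Because -fPIC affects how the compiler generates code, it has to be given when the .c files are compiled, so we rebuild our object files with it first and then link them with -shared:


$ gcc -fPIC -c func1.c func2.c
$ gcc func1.o func2.o -o libexample.so -shared -fPIC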


Now, if we give the same command to compile program.c as we did before, gcc will prefer the shared library over the static one:


$ gcc program.c -o program-shared -I. -L. -lexample
$ ldd program-shared
linux-gate.so.1 => (0xb783c000)
libexample.so => not found
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb76db000)
/lib/ld-linux.so.2 (0xb783d000)


You can see that program-shared does indeed depend on libexample.so. You might have noticed the "not found" text, and indeed if you try and run our program, you'll get an error:


$ ./program-shared
./program-shared: error while loading shared libraries: libexample.so: cannot open shared object file: No such file or directory


This error comes from the dynamic linker, which is /lib/ld-linux.so.2 on this (32-bit x86) system. When we run our executable, the dynamic linker has to resolve each shared library dependency in order to find the code the program needs to run. It will look in certain standard directories, like /lib/ and /usr/lib, but none of those contain a file named libexample.so that it can use to resolve symbols.

Luckily, there are several ways we can resolve this - by passing an -rpath option at link-time, by adding an entry to /etc/ld.so.conf, or by using the environment variable LD_LIBRARY_PATH. We'll be using the last one, because it's the most explicit. Usually you'd prefer one of the first two, particularly ld.so.conf.

Basically, we can give the dynamic linker extra locations in which to look for shared libraries by listing them in LD_LIBRARY_PATH, a ':'-delimited list of paths, much like the PATH environment variable tells the shell where to look for executables. So, we can run our program by setting this to the present working directory:


$ LD_LIBRARY_PATH=`pwd` ./program-shared
in func1
in func2
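
Since LD_LIBRARY_PATH is just a ':'-delimited string, it can help to see how such a list gets pulled apart. The following is a minimal sketch in C, assuming nothing more than the format described above. Both nth_path_entry() and print_path_entries() are hypothetical helpers made up for illustration, not part of the dynamic linker or any real library, and they don't interpret the entries in any way.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the n-th (0-based) entry of a ':'-delimited
 * list into out, truncating if needed and always NUL-terminating.
 * Returns 1 on success, 0 if the list is NULL or has no such entry.
 */
int nth_path_entry(const char *list, size_t n, char *out, size_t outsz)
{
    const char *start = list;

    if (list == NULL || outsz == 0)
        return 0;

    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        if (n == 0) {
            if (len >= outsz)
                len = outsz - 1;   /* truncate rather than overflow */
            memcpy(out, start, len);
            out[len] = '\0';
            return 1;
        }
        if (end == NULL)
            return 0;              /* ran out of entries */
        start = end + 1;
        n--;
    }
}

/* Print each entry on its own line and return how many there were. */
size_t print_path_entries(const char *list)
{
    char buf[4096];
    size_t i = 0;

    while (nth_path_entry(list, i, buf, sizeof buf)) {
        printf("%s\n", buf);
        i++;
    }
    return i;
}
```

Calling print_path_entries(getenv("LD_LIBRARY_PATH")) from a main() (with <stdlib.h> included for getenv) would list the extra directories currently in effect. If the variable isn't set, getenv returns NULL and nothing is printed.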


We can also demonstrate the magic of shared libraries by updating libexample.so without having to touch program-shared:


$ sed -i 's.in func2.In new, improved func2!.' func2.c
$ gcc -fPIC func2.c -c -o func2.o
$ gcc func1.o func2.o -o libexample.so -shared -fPIC
$ LD_LIBRARY_PATH=`pwd` ./program-shared
in func1
In new, improved func2!


Okay - the next post will deal with system libraries, ld.so.conf/ld.so.cache and some more funky stuff.