How the Linux Kernel initcall Mechanism Works

Armed with all of the above information, we're now ready to understand how the Linux kernel's initcall mechanism works. In fact, if you've understood most of what has been said up to this point, you already understand how it works; you might want to stop reading now and explore it on your own!

When you write a Linux kernel device driver there is a simple template that you follow. By following this template and adding some entries to the build system, you let a user compile your driver either into the kernel or as a loadable module. Every driver, when loaded, has an opportunity to run a one-time initialization function. Once this function has been called it is never called again for as long as your driver stays loaded. If your driver is built as a module, the function is called when the module is loaded. If your driver is compiled into the kernel, the function is called as the system boots. Keeping functions in memory that are called once during boot and never again is a considerable waste. The kernel developers have therefore arranged for all of this code to be put into its own ELF segment (strictly speaking an ELF section, although this article says segment throughout), which is thrown away once the machine is up and running and has passed the initialization phase.

Dumping a whole bunch of code into a separate segment at compile time is a nice idea, but how do you then call all those functions at run time? The functions aren't all the same length, and it wouldn't be very productive to force them to be! It therefore isn't possible to step through the code segment, calling functions as you go along. Luckily, although the function definitions aren't all the same length, pointers to functions are (on any one system), so we can build a table of pointers to all the initialization functions and step through this table, calling each one in turn. Since this table is also only needed at initialization time, it makes sense to put it into its own segment too, so that it can also be reclaimed once the initialization phase is complete.
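
To make the idea concrete before any linker tricks get involved, here is a minimal sketch of a table of function pointers that is simply written out by hand. The names short_init, longer_init, init_table and run_init_table are made up for this illustration and do not come from the kernel; the only point is that the two functions differ in size while their table entries do not.

```c
#include <stdio.h>

typedef int (*initcall_t)(void);

/* Two initialization functions of clearly different sizes. */
static int
short_init (void)
{
	return 0;
}

static int
longer_init (void)
{
	int i, sum = 0;

	for (i = 0; i < 10; ++i)
		sum += i;
	printf ("longer_init (): sum = %d\n", sum);
	return 0;
}

/* Hand-written table; every entry is sizeof (initcall_t) bytes. */
static initcall_t init_table[] = { short_init, longer_init };

/* Steps through the table and returns how many functions were called. */
int
run_init_table (void)
{
	size_t i;
	size_t count = sizeof (init_table) / sizeof (init_table[0]);

	for (i = 0; i < count; ++i)
		init_table[i] ();
	return (int) count;
}
```

Calling run_init_table () from a main() walks the table in order. The drawback of this hand-written version is that somebody has to maintain the table in one place, which is exactly the chore the kernel hands over to the linker.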

Notice that the above trick of putting the initialization code into one segment and the initialization function pointer call table into another segment (both of which can be released once the machine is up and running) is only used when a device driver is compiled into the kernel. If the device driver is compiled as a module then the initialization code is handled differently.

The decision as to whether to compile something into the kernel or as a module is made not at code-writing time by the device driver writer, but at kernel configuration and build time, sometimes by someone other than the device driver writer. It is important to try to use the same code for both situations, and it makes a lot of sense to make these things very easy to handle and code for the person writing the device driver. So how are these two situations handled? By writing a bunch of macros and getting the programmers to follow a template.
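
As a rough illustration of how one macro name can serve both situations, here is a sketch that switches the definition of module_init on a build-time flag. The flag DEMO_MODULE, the function my_driver_init and the helper driver_entry_point are hypothetical names invented for this example, and the module branch (aliasing the init function to one well-known name) is only loosely modelled on what the kernel does; the details differ between kernel versions, so treat it as an assumption rather than a description of any particular release.

```c
#include <stdio.h>

typedef int (*initcall_t)(void);

#ifdef DEMO_MODULE
/* Module build (assumed): expose the init function as init_module. */
#define module_init(fn) \
	int init_module (void) __attribute__ ((alias (#fn)))
#else
/* Built-in build: record a pointer to the function in its own section. */
#define module_init(fn) \
	static initcall_t __initcall_##fn \
	__attribute__ ((used, __section__ ("function_ptrs"))) = fn
#endif

/* The driver writer's code is identical in both configurations. */
static int
my_driver_init (void)
{
	printf ("my_driver_init ()\n");
	return 0;
}

module_init (my_driver_init);

/* Hypothetical helper for this demo: fetch the entry point either way. */
initcall_t
driver_entry_point (void)
{
#ifdef DEMO_MODULE
	return init_module;
#else
	return __initcall_my_driver_init;
#endif
}
```

Compiled as is, the built-in branch is taken; adding -DDEMO_MODULE to the gcc command line selects the other branch without touching the driver's own code, which is the whole point of hiding the difference behind a macro.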

I have distilled the Linux device driver template for a very simple driver into the following code, with the macros found and expanded for the case where the driver is built into the Linux kernel. If you are just learning to write your own device drivers, note that this is not what your code would look like at all, since device drivers do not contain a main()! I wrote this code so that it uses the same ideas and roughly the same code as the kernel, but can be played with by a regular user as an ordinary program rather than a device driver.

/*
 *     AUTHOR: Trevor Woerner
 * START DATE: 14 August 2003 - 09:58:33 AM
 *   MODIFIED: 23 September 2003 - 12:33:55 AM
 *   FILENAME: kernelcalls.c
 *    PURPOSE: Demonstrates how code works that is meant to be
 *             compiled into the kernel.
 *
 * Copyright (C) 2003  Trevor Woerner
 */

#include <stdio.h>

typedef int (*initcall_t)(void);
extern initcall_t __initcall_start, __initcall_end;

#define __initcall(fn) \
static initcall_t __initcall_##fn __init_call = fn
#define __init_call     __attribute__ ((unused,__section__ ("function_ptrs")))
#define module_init(x)  __initcall(x);

#define __init __attribute__ ((__section__ ("code_segment")))

static int __init
my_init1 (void)
{
	printf ("my_init () #1\n");
	return 0;
}

static int __init
my_init2 (void)
{
	printf ("my_init () #2\n");
	return 0;
}

module_init (my_init1);
module_init (my_init2);

void
do_initcalls (void)
{
	initcall_t *call_p;

	call_p = &__initcall_start;
	do {
		fprintf (stderr, "call_p: %p\n", (void *) call_p);
		(*call_p)();
		++call_p;
	} while (call_p < &__initcall_end);
}

int
main (void)
{
	fprintf (stderr, "in main()\n");
	do_initcalls ();
	return 0;
}
      

Let's examine these #defines closely.

  1. module_init(x) (expands to __initcall(fn))

    #define __initcall(fn) \
    static initcall_t __initcall_##fn __init_call = fn
    #define __init_call     __attribute__ ((unused,__section__ ("function_ptrs")))
    #define module_init(x)  __initcall(x);
             
    which is a macro that

    • takes a function name

    • defines a variable whose name is the concatenation of the string "__initcall_" and the function's name

    • gives that variable the type initcall_t (i.e. a function pointer)

    • applies the attributes from the expansion of the __init_call macro, which tell the compiler to put this object (a function pointer) into its own segment called function_ptrs

    • initializes the variable with the function's address

    This macro could be shortened to:
    #define module_init(fn) \
        static initcall_t __initcall_##fn __attribute__ ((section ("function_ptrs"))) = fn
             
    with no loss of generality (that I am aware of).

  2. __init

    #define __init __attribute__ ((__section__ ("code_segment")))
             
    is a macro that

    • tells the compiler to put every function marked with it into its own segment called code_segment
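
To see what the compiler is actually handed, here is my_init1 and its module_init line written out by hand as they look once the macros above have been expanded. This is a sketch of the expansion rather than real preprocessor output, and the helper call_first_initcall is a hypothetical addition of mine so that there is something to call.

```c
#include <stdio.h>

typedef int (*initcall_t)(void);

/* "static int __init my_init1 (void)" with __init expanded. */
static int __attribute__ ((__section__ ("code_segment")))
my_init1 (void)
{
	printf ("my_init () #1\n");
	return 0;
}

/*
 * "module_init (my_init1);" with module_init, __initcall and
 * __init_call expanded. The semicolon inside module_init leaves a
 * second, empty ';' behind in the real expansion; it is left out here.
 */
static initcall_t __initcall_my_init1
	__attribute__ ((unused, __section__ ("function_ptrs"))) = my_init1;

/* Hypothetical helper: call the function through its stored pointer. */
int
call_first_initcall (void)
{
	return __initcall_my_init1 ();
}
```

Nothing here refers to __initcall_start or __initcall_end, so this fragment links with the default linker script; the compiler simply emits the two extra sections and the linker finds somewhere to put them.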

Compiling this code we get... an error:

[trevor]$ gcc -o kernelcalls kernelcalls.c 
/tmp/ccVwvr4P.o(.text+0x9): In function `do_initcalls':
: undefined reference to `__initcall_start'
/tmp/ccVwvr4P.o(.text+0x30): In function `do_initcalls':
: undefined reference to `__initcall_end'
collect2: ld returned 1 exit status
[trevor]$ 
      
Oh yeah, that's right: those are the symbols that don't appear anywhere in the code, only in the linker script. That's what got me started on all this in the first place! A linker script is used to make this all work. To be honest, I'm not sure why the kernel developers don't take advantage of the fact that the GNU linker will give you start and end symbols for a section for free, but there's probably a good reason.
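
For the curious, here is roughly what "for free" looks like. When a section is not mentioned in the linker script and its name is a valid C identifier, GNU ld defines the symbols __start_NAME and __stop_NAME around it, so for function_ptrs we get __start_function_ptrs and __stop_function_ptrs without writing any script. The sketch below assumes GNU ld (or a linker that copies this behaviour). It also differs from kernelcalls.c in two ways that are my own choices: the attribute used replaces unused so the compiler keeps the pointers, and do_initcalls returns a count and uses a for loop so that an empty table is handled.

```c
#include <stdio.h>

typedef int (*initcall_t)(void);

/* "used" keeps each pointer even though nothing refers to it by name. */
#define module_init(fn) \
	static initcall_t __initcall_##fn \
	__attribute__ ((used, __section__ ("function_ptrs"))) = fn

/* Defined by the linker itself; no linker script is involved. */
extern initcall_t __start_function_ptrs[];
extern initcall_t __stop_function_ptrs[];

static int
my_init1 (void)
{
	printf ("my_init () #1\n");
	return 0;
}

static int
my_init2 (void)
{
	printf ("my_init () #2\n");
	return 0;
}

module_init (my_init1);
module_init (my_init2);

/* Calls every function in the table; returns how many were called. */
int
do_initcalls (void)
{
	initcall_t *call_p;
	int count = 0;

	for (call_p = __start_function_ptrs;
	     call_p < __stop_function_ptrs; ++call_p) {
		(*call_p) ();
		++count;
	}
	return count;
}
```

Add a main() that calls do_initcalls () and this builds with a plain gcc command line, no -T option required. What it does not give you is any say over where the section ends up in the final image, which may well be one of the good reasons alluded to above.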

Creating a valid linker script by hand from scratch would be a nice exercise, but it is not something I have the time to investigate. Instead I'll get the linker to tell me what its default linker script is and modify that to produce the script I need. You can get the default script by running gcc -Wl,--verbose at the command line; I have saved its output as linker.lds. You can find the contents of that default script in this section.

Following the lead of the kernel's linker scripts I have added the following lines to the linker script:

__initcall_start = .;
function_ptrs   : { *(function_ptrs) }
__initcall_end = .;
code_segment    : { *(code_segment) }
      

Which results in the following output:

[trevor]$ make
gcc -Tlinker.lds -o kernelcalls kernelcalls.c
[trevor]$ ./kernelcalls
in main()
call_p: 0x80482cc
my_init () #1
call_p: 0x80482d0
my_init () #2
[trevor]$
      
It works!

The output of objdump -t looks like:

08048274 g     F .init          00000000  _init
08048274 l    d  .init          00000000
0804828c l    d  .plt           00000000
0804829c       F *UND*          00000023  fprintf@@GLIBC_2.0
080482ac       F *UND*          000000fb  __libc_start_main@@GLIBC_2.0
080482bc       F *UND*          00000039  printf@@GLIBC_2.0
080482cc g       *ABS*          00000000  __initcall_start
080482cc l     O function_ptrs  00000004  __initcall_my_init1
080482cc l    d  function_ptrs  00000000
080482d0 l     O function_ptrs  00000004  __initcall_my_init2
080482d4 g       *ABS*          00000000  __initcall_end
080482d4 l     F code_segment   0000001d  my_init1
080482d4 l    d  code_segment   00000000
080482f1 l     F code_segment   0000001d  my_init2
08048310 g     F .text          00000000  _start
      

Notice that if we swap the following two lines in the source:

module_init (my_init2);
module_init (my_init1);
      
the output becomes:
[trevor]$ make
gcc -Tlinker.lds -o kernelcalls kernelcalls.c
[trevor]$ ./kernelcalls
in main()
call_p: 0x80482cc
my_init () #2
call_p: 0x80482d0
my_init () #1
[trevor]$ 

and the objdump -t output becomes:
      

08048274 g     F .init          00000000  _init
08048274 l    d  .init          00000000
0804828c l    d  .plt           00000000
0804829c       F *UND*          00000023  fprintf@@GLIBC_2.0
080482ac       F *UND*          000000fb  __libc_start_main@@GLIBC_2.0
080482bc       F *UND*          00000039  printf@@GLIBC_2.0
080482cc g       *ABS*          00000000  __initcall_start
080482cc l     O function_ptrs  00000004  __initcall_my_init2
080482cc l    d  function_ptrs  00000000
080482d0 l     O function_ptrs  00000004  __initcall_my_init1
080482d4 g       *ABS*          00000000  __initcall_end
080482d4 l     F code_segment   0000001d  my_init1
080482d4 l    d  code_segment   00000000
080482f1 l     F code_segment   0000001d  my_init2
08048310 g     F .text          00000000  _start