Almost every programmer knows C, since it is often the first programming language they learn. However, C is hard to understand deeply because it is so low-level. In this article, we will focus on the extern keyword and find out what it really does. Before we explain the extern keyword, we have to understand some basic concepts. OK, let’s begin.
The text of a program is kept in source files, which we also call units. A source file together with all the headers and source files included via the preprocessing directive #include is known as a preprocessing translation unit.
For example, if example.c uses #include to include a header file and snippet.c, those three files constitute a preprocessing translation unit.
When the compiler compiles example.c, the first step is preprocessing. The preprocessor recursively replaces every #include directive with the literal contents of the named file (usually a header file, but possibly another source file), expands the macros defined with #define, and applies conditional compilation directives such as #ifdef. After preprocessing, a preprocessing translation unit becomes a translation unit.
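To make the preprocessing step concrete, here is a minimal sketch of a source file that uses all three kinds of directive. The names SQUARE, USE_MACRO, and area are hypothetical and chosen only for illustration; they are not taken from example.c.

```c
#include <stddef.h>   /* replaced by the literal contents of the header */

/* Hypothetical macro: every SQUARE(x) is expanded textually. */
#define SQUARE(x) ((x) * (x))

/* Hypothetical switch used for conditional compilation. */
#define USE_MACRO

/* Returns the area of a square with the given side length. */
size_t area(size_t side)
{
#ifdef USE_MACRO
    return SQUARE(side);   /* kept, because USE_MACRO is defined above */
#else
    return side * side;    /* dropped by the preprocessor */
#endif
}
```

Running the preprocessor on a file like this should show the header contents pasted in at the top and the body of area reduced to a single return statement with the macro already expanded.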
# Use the gcc or cpp command to get the output of the preprocessor.
gcc -E example.c -o example.i
cpp example.c -o example.i
example.i is the translation unit. A translation unit contains all the information needed for the rest of compilation. If we feed the translation unit to gcc, it will compile it into an object file.
gcc -c example.i -o example.o
Because example.i already contains everything the compiler needs, no other files are required. The output file example.o is the object file corresponding to that translation unit.
The next step is linking. The linker (such as ld) combines all the object files into a single executable file (or a library, or another object file). Linking involves symbol resolution, symbol relocation, and so on. In practice the linker is usually invoked through the gcc driver, which also passes it the C runtime startup files and the standard library.
ld example.o -o example
So what is linkage, really? I don’t want to cite the definitions from the C99 standard because they are too abstract. Once you understand the concept of a translation unit, the meaning of linkage becomes clear.
There are three kinds of linkage: external, internal, and none.
A variable declared outside any function and defined without the static keyword has external linkage. It can be accessed by any function in any translation unit of the program.
A variable declared outside any function and defined with the static keyword has internal linkage. It can only be accessed by functions in the same translation unit.
A variable declared within a block of code has no linkage. For example, variables defined inside function bodies, inside curly braces, or in the parentheses of a for loop have no linkage (unless they are declared with extern) and can only be used within that block.
A function declared without the static keyword has external linkage. It can be called from any function in any translation unit of the program.
A function declared with the static keyword has internal linkage. It can only be called from functions in the same translation unit.
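The three kinds of linkage can be seen side by side in a single file. The following is only a sketch; shared_counter, private_counter, bump_private, and bump_both are hypothetical names invented for this illustration.

```c
/* External linkage: visible to every translation unit in the program. */
int shared_counter = 0;

/* Internal linkage: visible only inside this translation unit. */
static int private_counter = 0;

/* Internal linkage: callable only from this translation unit. */
static int bump_private(void)
{
    return ++private_counter;
}

/* External linkage: callable from any translation unit. */
int bump_both(void)
{
    int step = 1;              /* no linkage: exists only in this block */

    shared_counter += step;
    return shared_counter + bump_private();
}
```

Another source file could refer to shared_counter and bump_both, but any attempt to refer to private_counter or bump_private from there would fail at link time, because those names never leave this translation unit.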
Although a variable or function with external linkage can be accessed from other translation units, you can’t use it there directly, because the compiler processes one translation unit at a time and has not seen a declaration for it. You have to tell the compiler that the variable or function is defined in another translation unit, so that it can compile the code that uses it and leave the reference for the linker to resolve. This is what extern does. When you want to use a variable or function that has external linkage outside the translation unit that defines it, you have to declare it with the extern keyword before using it.
extern int a; /* declaration before using */
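Here is a minimal sketch of that pattern. To keep it in one block, the declarations and the definitions are shown in the same file; in a real program the definitions at the bottom would live in another source file and the linker would connect the two. The parameter list of sort, the value of a, and the helper smallest_plus_a are assumptions made for this example.

```c
/* Declarations: tell the compiler these names are defined elsewhere. */
extern int a;
extern void sort(int *values, int count);

/* Hypothetical user of the declared names. */
int smallest_plus_a(int *values, int count)
{
    sort(values, count);       /* compiled using only the declaration */
    return values[0] + a;
}

/* Definitions: would normally sit in a different translation unit. */
int a = 10;

/* Simple insertion sort in ascending order. */
void sort(int *values, int count)
{
    for (int i = 1; i < count; i++) {
        int key = values[i];
        int j = i - 1;

        while (j >= 0 && values[j] > key) {
            values[j + 1] = values[j];
            j--;
        }
        values[j + 1] = key;
    }
}
```

Note that smallest_plus_a compiles before the compiler has seen the definitions of a or sort; the extern declarations are enough for it to generate the code.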
You may think extern void sort(); looks weird, because you usually write void sort(); without an explicit extern keyword. These two forms are identical. The C99 standard says:
The storage-class specifier, if any, in the declaration specifiers shall be either extern or static.
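So even if you don’t write the extern keyword explicitly, the compiler still treats the declaration as extern void sort();. The standard also states (section 6.2.2) that a function declared with no storage-class specifier has its linkage determined exactly as if it had been declared with extern. Note that this equivalence applies to functions only: at file scope, int a; is a (tentative) definition, while extern int a; is only a declaration.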
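A quick way to convince yourself is to write both forms for the same function in one file. In this sketch, twice is a hypothetical function used only for illustration; a conforming compiler should accept the two declarations as compatible redeclarations of the same function.

```c
extern int twice(int n);   /* explicit extern */
int twice(int n);          /* implicit extern: same linkage, same type */

/* Single definition that matches both declarations above. */
int twice(int n)
{
    return 2 * n;
}
```

If the two forms meant different things, the compiler would report a conflicting declaration here; it does not, because both declare one function with external linkage.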