SoftwareSphere Home

C++ Physical Design

Chapter 3
Dependencies

This chapter concerns the different kinds of dependencies that exist among source code elements and among libraries in C/C++. Understanding dependencies is fundamental for making reusable libraries. If a library has a lot of dependencies towards other libraries we will not be able to reuse it in a different context. If two or more libraries are dependent on each other we just can’t reuse them separately, we are forced to reuse them in block so we should better join them into a single library.

The main problem caused by dependencies during software development is the increase of build times, because if a header file is modified all the cpp file that include it directly or indirectly must be recompiled.

3.1 File dependencies

A file dependency appears when a piece of code in a file uses a symbol which has been defined in another file. In C++ a symbol can be used in a point of the code only if the compiler already knows the definition of that symbol in that point, and the compiler reads files only once, so all the symbols used have to be declared before, in the same file or in other files that are inserted in the current file by means of the include directive.

Notice that by means of the extern keyword the C/C++ allows to use a symbol without a declaration, so there is also a kind of dependency that does not go through an include directive, however this keyword can be used only for global functions, variables or constants, not for classes.

Notice also that the forward declarations allow to use a class identifier for declaring pointers without having the complete declaration of that class, but when a cpp file is compiled the complete declaration is needed anyway.

We conclude that we can look just at the include directives as the way that creates the dependencies from a source file to a set of other files.

The C++ language allows to include in a source file almost any compiler readable file, but usually the included files are header files, which contains only declarations and definitions. In this book we suppose that include directives are always directed toward h files, not cpp files. Both cpp and h files can contain include directives.

A file dependency corresponds to an include directive: if a file a.cpp contains an include directive towards file b.h then we say that the first file has a direct dependency towards the second one. We can say that file a.cpp includes file b.h as another way of saying that it contains an include directive that points to b.h.

Source file dependencies are also the cause of the main type of dependency among libraries.

3.2 Library dependencies

There may be many kinds of library dependency. The first way we are considering is the source code dependency, which occurs when a source file of a library includes a header file belonging to another library, however this may happen in some different ways.

3.3 Direct dependencies

Let’s see the definition of direct dependency, or dependency of order zero:

A library A has a direct dependency toward a library B when a file of A includes one or more header files of B.

Notice that if a library A has a direct dependency towards a library B there could be many source files of A and header files of B which take part into this dependency, for this reason a direct dependency may be more or less strong.

We can identify two numbers involved in a direct dependency: the number of source files of A which include a header file of B, and the number of header files of B which are included by some source file of A. We could also consider for every source file of A the number of header files of B that it includes, and for every header file of B the number of source files of A that include it.

A direct dependency is often represented by a simple arrow in a graph, but that simple arrow hides a higher complexity, it can be considered as composed by many lines which go from some source files of A to some header files of B.

PIC
Figure 3.1: Example of direct dependency among libraries

3.4 Indirect dependencies

There are also dependencies of order greater than zero which are defined as follows:

A library A has a dependency of order n towards a library B when it exists a series of libraries C1, C2, ... Cn such that A has a direct dependency towards C1, Cn has a direct dependency towards B and each library of the series has a direct dependency towards the next in the series.

Notice that the indirect dependencies may pass through different files. For example a library A may be using some header file h1 of library B but some source file of B may be using some header file h2 of a library C without the library A knowing the existence of the C library. This kind of dependency is not detected by the C++ preprocessor because when we compile the A source files we never reach the C header files, which are included only by some B source file.

We can represent an indirect library dependency with a dashed arrow, however the dependency graphs usually show only the direct dependencies and leave the reader free to find the indirect dependencies when needed.

PIC
Figure 3.2: Example of indirect dependency among libraries
PIC
Figure 3.3: Indirect dependency through different files
PIC
Figure 3.4: Another example of indirect dependency

Finally we can define the general dependency which can be direct or indirect of any order:

A library A has a general dependency towards a library B when it has at least a dependency towards B of any order.

3.5 Reciprocal dependencies

A reciprocal dependency is the one between two libraries A and B which both have a direct dependency towards the other. When two libraries have a reciprocal dependency they can’t be used separately so it would be better if they were one single library. For this reason we always organize libraries to avoid reciprocal dependencies.

PIC
Figure 3.5: Example of reciprocal dependency of libraries

Of course the reciprocal dependency between two library may be caused by different files.

PIC
Figure 3.6: Another example of reciprocal dependency of libraries

3.6 Circular dependencies

There is a circular dependency when we can start from a library and return to it following the arrows of direct dependency. The reciprocal dependency is a special case of circular dependency when the length of the loop is just two.

Circular dependencies are even worst than reciprocal dependencies because within a circular dependency all the libraries touched by the loop are actually inseparable. In a circular dependency the libraries which participate to the loop are all indirectly dependent towards each other.

PIC
Figure 3.7: Example of circular dependency of libraries

In most systems it is better to completely avoid circular dependencies, this of course excludes reciprocal dependencies too.

3.7 Dependency graph

We can summarize all the information about the library dependencies using a graph. From graph theory we have many concepts and algorithms for treating this information. The dependency graph will be created with nodes for libraries and arcs for the direct dependencies. It will be a direct graph. Saying that the libraries have no circular dependencies means that the dependency graph has no loops.

In this graph we’ll have a set of nodes which has no entering arcs, they are libraries which aren’t used by any other library. There will be some nodes with no exiting arcs, they are libraries which do not use any other library. There may be nodes without entering or exiting arcs, they are insulated libraries, which are not used and do not use anything. Finally there will be nodes with at least one exiting arc and one entering arc, they are libraries which are at the same time used by something and user of something. Let’s introduce the following terms:

Notice that with this terminology the base libraries correspond to the leaves of the tree in the sense of the graph theory, and the terminal libraries correspond to the roots.

An insulated library is one that is not used, so it is like a terminal one, and does not uses any other library, so it is like a base. Of course a library which is not used by any other library will have some reason to exist, for example because we keep it there for future use in other projects, or because it is used only by the main function, or because it is used in another way that doesn’t go through header inclusions, like the explicit loading of DLLs by means of operating system API. The non include based dependencies will be examined further.

Base libraries are those which do not use any other library, so they can be built for first.

PIC
Figure 3.8: Example of a library dependency graph

3.8 Library levels

If the library dependency graph has no loops we can collect the base libraries in a set called the base level or level 0, then we can collect all the libraries which have dependency only towards the base libraries and we call this set the level 1, then we can collect the libraries which have dependencies only towards base libraries or level 1 libraries and we call this set the level 2 and so on. The basic character of a level is that it has no dependencies towards higher levels, it can only have dependencies towards lower levels.

With this definition the level n is made by libraries which have dependencies only towards libraries of levels lower than n.

Notice that the level n may have dependency towards levels 0, 1,...,n-1, not only towards level n-1. For this reason we don’t call them “layers” which could suggest that each layer depended only on the layer immediately below it.

PIC
Figure 3.9: Example of levels of libraries

3.9 Construction order of a set of libraries

If we have a set of libraries without circular dependencies we can arrange them in a sequence such that no element of the sequence depends on an element that follows it, and then we can build the whole set of libraries in this order.

To create the construction sequence we first group the libraries in levels, then we append to the list the libraries of every level starting from the root level. Libraries in a level do not depend on libraries of the same level so we can build the libraries of a level in any order.

The construction order tells which libraries have to be built before we start to build a certain library. If we want to rebuild a single library we have to be sure that all the libraries which precede it in the construction order have been already built.

PIC
Figure 3.10: Example of construction order of a set of libraries

3.10 Extracting dependencies

The only way to extract all the source code dependencies is to run the C/C++ preprocessor for every single source file of a library. This happens because of the conditional compilation directives that can make the compiler skip some lines of the source files, and thus some include directives. The dependency extractor has to maintain a symbol table and it also has to be able to calculate expressions that can appear in the conditional directives. Finally an include directive may contain a macro instead of a file specification so the program must also expand macros. We conclude that the dependency extractor is equivalent to the C/C++ preprocessor, but the dependency extractor can skip all the non directive lines.

The behavior of the dependency extractor is in general not the same each time it comes into the same header file, because the sets of symbols definitions may be different. The symbols may be defined in the source or header files or they could be provided from the external system.

If the dependency extraction was independent of the sequence of files traversed to arrive at a given header then it would be possible to skip the multiple analysis of the same header and the dependency graph for files could be generated in much less time.

If the dependency extractor has to parse every header each time it encounters it the parsing time becomes very long, even for fast machines, and it is not possible to embed the dependency extraction procedure in an interactive tool.

For example, in a Microsoft Visual Studio 6.0 environment, a source file which uses the MFC library will have typically 250 dependencies towards headers of the MFC, the C/C++ runtime library and the Windows SDK and this list requires about 2 seconds to be built. If a project contains an hundred files the time required for extracting the dependencies would be around several minutes, 250 files need 500 s which are more than 7 minutes (tests done on a Pentium 1000 MHz with 348 MB Ram).

It is possible to arrange a dependency extraction tool by modifying the code of a typical C/C++ preprocessor. If in addition we suppose that all the include directives are placed at the beginning of the files, the tool can stop parsing a file when it finds the first line that is not a preprocessor directive and the time required to extract the dependencies becomes enough short even for big libraries.

3.11 Construction of interdependent dynamic link libraries

Even if we prefer to organize libraries in levels and to not have circular dependencies among libraries the C/C++ linker can build Dynamic Link Libraries with reciprocal dependencies if we make it follow the right steps. This is possible because in the case of DLLs the linker creates two files, a lib file and a dll file, the first lists the symbols exported by the DLL and the second contains the compiled code. Having the lib file of a DLL the linker can compile another DLL even if the dll file of the first is still not available.

3.12 Non include based dependencies

There are other kinds of dependencies which are not related to C/C++ header inclusion. Most operating systems provide to C/C++ programs some global functions to load a DLL in the current process. These functions use the library file name as a parameter. If a library A loads a library B only by means of the operating system API the dependency remains hidden to the source code analysis, except if we analyze the function calls using the information about the operating system API.

This runtime dependencies must be treated explicitly in source code, most of the times the program can check if the needed libraries are present and provide a sort of error reporting without crashing.

The explicit loading of dynamic link libraries is a very good technique since it completely detaches a library from the libraries that use it. The disadvantages are that C++ classes are difficult to load in this way because of the decorated names created by the compiler and used by the linker for the class members. The indirect loading of libraries is used mostly with libraries that export a single function or a small bunch of functions. These functions are C style functions, i.e. they are like global functions, but they can’t have calls in the user module since they are not known at link time.