For most software development experts, a software project is made of concepts, entities, abstractions, ideas and information. All of these things are immaterial and abstract: we can only speak about them or describe them with diagrams.
Things like classes, objects, functions, variables, design patterns and class relations are very important for understanding how software works and how it can be modified or extended, but they are not the whole story.
Software written in a programming language like C or C++ is still a collection of pieces of source code, written into files and stored on some medium, with the constraints of a limited character set and a structure imposed by some file system. The result of the compiling and linking processes is a set of binary files which are stored in the same file system, and which are connected in some way to each other and to other external files.
The physical design concerns how the source files and the executable files are organized, connected and stored.
In most programming courses, and also in schools, the organization of source code files receives very little lesson time. They show you a simple project with at most a dozen files, all placed in the same directory. They put the main effort into describing the abstract concepts that are needed to create software, and so neglect the very serious problems of code organization that occur in large projects.
The quality of the physical design shows during the development cycle and during software maintenance. One of the main aspects of a physical design is how easy the builds are and how short the build times are: a good physical design requires only a few files to be recompiled when a small change is made to the source code or a new source file is added. In addition, a physical design has many other aspects, like maintainability, extensibility, robustness and so on.
There are of course many different physical designs; however, some of them are not as good as they may seem, and some can really be worse than a "random" design.
Finally, the physical design is very important when we want to build C++ software for multiple platforms, using different compilers and possibly different file systems.
We need to go back to the roots, to the very beginning of software development, and ask ourselves again what a program is and what the thing we are trying to build is.
In today’s computers a program is a computer file or a set of files which can be used to create and run a process in the operating system. The operating system provides all that is needed to execute a process, and the program provides the instructions to be executed. We can say that the true nature of a program is the process it can create in the computer.
From the software development perspective the program is just a file or a set of files. In C++ these files are made of binary instructions and data, since they are compiled.
A program can be composed of a single executable file; this is the simplest way to build a program. However, building large single-piece executable files may slow down the development process considerably, so large programs are usually built as a set of binary files, which are executables or dynamic link libraries.
We could build a large system out of many separate executable programs, each creating its own process. The processes can be launched when they are needed, and some mechanism can be used to pass data from one process to another. This is a common style of software development in command-line environments like Unix. In Unix we have many separate programs, each designed to do one thing and produce an output file or stream, so that their output can be passed to another program as input. This is a very procedural way of designing systems, and it generally performs worse because of the input and output between processes.
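As an illustration of the Unix style described above, here is a minimal sketch of a filter written so that it only depends on an input stream and an output stream. The names upcase_filter and run_upcase_filter are hypothetical and exist only for this example.

```cpp
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical filter: copies `in` to `out`, converting every character
// to upper case. It knows nothing about where the data comes from.
void upcase_filter(std::istream& in, std::ostream& out) {
    char c;
    while (in.get(c)) {
        out.put(static_cast<char>(std::toupper(static_cast<unsigned char>(c))));
    }
}

// Helper for trying the filter in memory, without a real pipe.
std::string run_upcase_filter(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    upcase_filter(in, out);
    return out.str();
}
```

In a stand-alone program of this kind, main would simply call upcase_filter(std::cin, std::cout), and the shell would connect the program to others through pipes. Every byte then crosses a process boundary, which is the cost mentioned above.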
In this book we are interested in how to make a single-process program which can be large and can grow with time. All the data will flow inside the same process, so it will be as fast as possible for a computer. With today’s computers, which have large amounts of RAM, it is convenient to hold most of the data in memory. Of course the persistent data will always be stored in some kind of database, but the data that the program is using at a given moment will be in memory.
With C++ we can make many types of programs, for example procedural programs in the style of old C, or almost completely object-oriented applications. The C++ programming language is very flexible and can be used in many different ways; however, for the development of large software we restrict ourselves to one single way of using the language.
A C++ program for us is a collection of C++ classes and, if the program is
an executable, a single global function called main which is the only
starting point of the program. The classes are the true content of the
program. Of course the main function will do something with the classes
contained in the program; for example it will create some objects, or call a
static member function of some class. Notice however that in C++,
because of class constructors, something may happen even before
the main function is entered, for example the construction of global
objects.
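The remark about global objects can be sketched as follows. The class UTIL_Tracer, the object g_tracer and the helper functions are hypothetical names used only for this illustration; the sketch assumes a typical implementation, where global objects are constructed during program start-up.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical log shared by this sketch. It is a function-local static
// so that it is ready the first time it is used, even when that happens
// while global objects are still being constructed.
std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical class whose constructor has a visible side effect.
class UTIL_Tracer {
public:
    explicit UTIL_Tracer(const std::string& name) {
        startup_log().push_back("constructed " + name);
    }
};

// A global object: on typical implementations its constructor runs
// during program start-up, before the first statement of main.
UTIL_Tracer g_tracer("g_tracer");

// What main would see if it called this as its very first statement.
std::size_t entries_on_entry_to_main() {
    return startup_log().size();
}
```

If main calls entries_on_entry_to_main() as its first statement, the log already holds the entry written by the constructor of g_tracer, even though main has not created any object yet.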
Our definition of a C++ program is very important because it opens the way to the program’s modularity. Since the program is just a collection of classes we can make the program grow by adding new classes, or by adding other sets of classes. A set of classes may not be an executable on its own but it can be considered a program.
Then we can define a C++ program as a set of sets of C++ classes. The classes of the same set may be connected to each other or they may be totally independent. Also the sets may be connected to each other or they may be independent.
For an easier reading we can call a set of classes a package or a
library. To avoid the confusion due to the meaning of the term package
in the Java programming language we will use mainly the term library. A
library is a collection of C++ classes. A program is a collection of libraries with
one global function called main.
We also require some other properties from the libraries. First of all, a class can be in one and only one library, so it is not possible to have the same class in more than one library.
This concept of a program is what we need from the point of view of source code management, but not all the classes belonging to the libraries will really be included in the final executable. Typically the linker will insert in the final executable only the code that is used somewhere, but we can ignore this fact during the design phase.
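The definitions above (a library is a set of classes, a program is a collection of libraries, and a class belongs to exactly one library) can be written down as a small data model. This is only a sketch: Library, Program and classes_in_several_libraries are hypothetical names, and classes are represented simply by their names.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// A library is a named set of class names (hypothetical model).
struct Library {
    std::string name;
    std::set<std::string> classes;
};

// A program is a collection of libraries.
typedef std::vector<Library> Program;

// Returns the class names that appear in more than one library.
// An empty result means the "one class, one library" rule holds.
std::set<std::string> classes_in_several_libraries(const Program& program) {
    std::map<std::string, int> count;
    for (const Library& lib : program) {
        for (const std::string& cls : lib.classes) {
            ++count[cls];
        }
    }
    std::set<std::string> duplicated;
    for (const auto& entry : count) {
        if (entry.second > 1) duplicated.insert(entry.first);
    }
    return duplicated;
}
```

A check of this kind could be run over the class lists of a project; a non-empty result would point at the classes that violate the rule.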
Since we are making single-process programs, all the class symbols must exist in the same global identifier space. The C++ standard allows us to define additional namespaces, which are actually sub-spaces of the global namespace; however, this doesn’t change anything, since the complete name of a symbol, including all its namespace names, must be unique in the program.
Since the complete names must be unique, C++ namespaces aren’t strictly needed: we could use a suitable naming convention to group the classes in some way.
C++ namespaces are useful to group classes and other parts of code. Of course we could try to create libraries which correspond to namespaces, but this isn’t always possible: sometimes it is useful to extend a namespace that has already been defined in another library.
C++ namespaces actually don’t interfere with our concept of libraries or packages. A namespace can span more than one library, and a library can contain parts of more than one namespace.
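A short sketch of this independence follows. Two libraries are shown in a single listing for brevity; in a real project each would live in its own source files. The namespace names gui and io and the classes inside them are hypothetical.

```cpp
#include <string>

// --- Library FON: contributes one class to namespace gui ---
namespace gui {
    class Font {
    public:
        std::string name() const { return "gui::Font"; }
    };
}

// --- Library BMP: reopens namespace gui and also adds to namespace io ---
namespace gui {
    class Bitmap {
    public:
        std::string name() const { return "gui::Bitmap"; }
    };
}

namespace io {
    class BitmapLoader {
    public:
        // Returns a default bitmap; a real loader would read a file.
        gui::Bitmap load() const { return gui::Bitmap(); }
    };
}
```

Here the namespace gui spans both libraries, while library BMP contains parts of two namespaces. Nothing in the language ties a namespace to the library where it was first opened.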
After many large C++ projects we have found that C++ namespaces are perhaps not as useful as they may seem, and that naming classes with a prefix may be better. This happens because namespaces are a logical concept, completely detached from the real modules, which are the libraries.
To make it clear that a class or a file belongs to a certain library, we can use a prefix in the name of the class or file. For example, given a library called ABC, every class of that library would have a name which starts with ABC_, and every source code file of that library would have a name that starts with ABC_. This is a very robust way of defining libraries by means of the source code. Of course it looks a bit ugly in source code when an instance of a class is declared; however, that happens only for class identifiers, while variables of that type, or pointers to that type, don’t need to have the prefix.
Example: for a set of libraries for a graphical interface we can have:
Libraries: FON, BMP, DEV, FILE, UTIL
Classes: FON_Font, FON_FontManager, BMP_Bitmap, BMP_BitmapLoader, BMP_MemoryBitmap, DEV_Device, DEV_ScreenDevice, DEV_MemoryDevice, FILE_TextFile, FILE_BinaryFile, UTIL_Log.
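In source code the prefix convention could look like the sketch below. The class names FON_Font and DEV_Device come from the example above, but their members, and the function draw_title, are invented here purely for illustration.

```cpp
#include <string>

// Library FON
class FON_Font {
public:
    explicit FON_Font(const std::string& family) : family_(family) {}
    const std::string& family() const { return family_; }
private:
    std::string family_;
};

// Library DEV
class DEV_Device {
public:
    // Hypothetical operation: returns a description of what would be drawn.
    std::string drawText(const FON_Font& font, const std::string& text) const {
        return text + " [" + font.family() + "]";
    }
};

// Only the type names carry the prefix; variables are named freely.
std::string draw_title(const std::string& title) {
    FON_Font font("Helvetica");
    DEV_Device device;
    return device.drawText(font, title);
}
```

The prefix appears once per declaration, in the type name, while the variables font and device stay short. A reader can also tell at a glance that DEV_Device depends on something from library FON.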
When using prefixes it is possible to recreate the file list of each library from the entire set of source files, so there is no need to maintain separate project files that list the files belonging to the different libraries. A build environment could provide a function that creates the projects automatically on the basis of the source file name prefixes.
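A minimal sketch of such a function is given below. It assumes that the library prefix is the part of the file name before the first underscore and that files without an underscore belong to no library; the function names and the file extensions used when calling it are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the library prefix of a file name, that is the text before the
// first underscore, or an empty string when there is no underscore.
std::string library_prefix(const std::string& file_name) {
    std::string::size_type pos = file_name.find('_');
    return pos == std::string::npos ? std::string() : file_name.substr(0, pos);
}

// Groups file names by library prefix. Files without a prefix are skipped.
std::map<std::string, std::vector<std::string> >
group_by_library(const std::vector<std::string>& file_names) {
    std::map<std::string, std::vector<std::string> > libraries;
    for (const std::string& file : file_names) {
        const std::string prefix = library_prefix(file);
        if (!prefix.empty()) libraries[prefix].push_back(file);
    }
    return libraries;
}
```

Given the files FON_Font.cpp, FON_Font.h, BMP_Bitmap.cpp and main.cpp, the result contains two libraries, FON with two files and BMP with one, while main.cpp is left out.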
If we don’t want to use package prefixes in class names we have to use namespaces and be sure that each namespace contains just the classes of one library, and each library contains only that namespace. In other words we have to create a different namespace for each separate library. In this way two libraries may still contain classes with the same inner name.
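The one-namespace-per-library alternative can be sketched like this. The namespaces fon and bmp, the class name Manager and the function describe_both are hypothetical, and both libraries are shown in one listing only for brevity.

```cpp
#include <string>

// Library FON: everything lives in namespace fon, and nothing else does.
namespace fon {
    class Manager {
    public:
        std::string describe() const { return "fon::Manager"; }
    };
}

// Library BMP: its own namespace, so the inner name Manager can be reused.
namespace bmp {
    class Manager {
    public:
        std::string describe() const { return "bmp::Manager"; }
    };
}

// Client code qualifies the name to say which library it means.
std::string describe_both() {
    fon::Manager fonts;
    bmp::Manager bitmaps;
    return fonts.describe() + " " + bitmaps.describe();
}
```

The two classes share the inner name Manager without conflict because their complete names, fon::Manager and bmp::Manager, are still unique in the program.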
Whether to use prefixes or namespaces is a matter of personal preference. Prefixes are shorter than namespaces and don’t need the scope resolution operator “::”.
Notice that C++ namespaces can be nested while libraries cannot. This is an important difference to take into account if we want to use namespaces instead of name prefixes. If we associate namespaces with libraries, we may be led to think that a library can contain sub-libraries, but that is not possible.
Finally, we can also avoid prefixes in class names if we agree to use unique class names throughout the set of libraries we are designing.
Copyright 2009 SoftwareSphere