The rule

Including a given header file, without including any other header beforehand, should never be enough to break the build. In other words, every header file should be self-contained: it has to pull in the dependencies that it directly requires, and no more.

Interestingly, note that this does not mean that a header file must include everything it may ever need to be fully usable.

The example

Let’s take a look at an example to give some meaning to the above. Consider a thelog library that implements functionality to write to a log file and provides various supporting data types. This library is composed of the following files:

  • thelog/delta.h: Defines the struct delta type. Includes no other header file.
  • thelog/log_entry.h: Defines the struct log_entry type and helper functions.
  • thelog/timestamp.h: Defines the struct timestamp type. Includes no other header file.
  • thelog/user.h: Defines the struct user type. Includes no other header file.

With these types in mind, the thelog/log_entry.h file could look like this:

#if !defined(THELOG_LOG_ENTRY_H)
#define THELOG_LOG_ENTRY_H

#include <thelog/user.h>
#include <thelog/timestamp.h>

struct log_entry {
    struct user caller;
    struct timestamp start_time;
    struct timestamp end_time;
};

struct delta;
struct delta* log_entry_length(const struct log_entry*);

#endif  /* !defined(THELOG_LOG_ENTRY_H) */

Take a close look at the types referenced in this header (struct user, struct timestamp and struct delta) and compare them to the included header files. Notice anything? We are including header files for only 2 of the 3 referenced types and using a forward declaration for the third. Why is that?

Simple. The struct user and struct timestamp data types are embedded by value in struct log_entry and therefore determine the size of the log entry. To instantiate such a structure, the compiler must know the complete definition of every member type, so for the build to work, you have to pull in their definitions, which conveniently live in separate header files.

On the other hand, the prototype of log_entry_length does not require any knowledge of the specifics of its return type. The function returns a pointer, so knowing only that the type exists is enough to yield a valid function declaration, and thus a forward declaration is sufficient to fulfill the build requirements.
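To make the distinction concrete, here is a minimal sketch of what the compiler accepts while struct delta is still an incomplete type. The struct delta_holder type is hypothetical and not part of thelog, and the compile-time check assumes a C11 compiler.

```c
#include <assert.h>  /* static_assert (C11) */

struct delta;      /* Incomplete type: only the name is known here. */
struct log_entry;  /* Also left incomplete to keep the sketch short. */

/* Fine: a declaration may mention incomplete types through pointers. */
struct delta* log_entry_length(const struct log_entry*);

/* Fine: a pointer member has a known size, whatever it points to. */
struct delta_holder {  /* hypothetical type, for illustration only */
    struct delta* length;
};

/* Checked at compile time: the holder's size is known even though the
 * layout of struct delta is not. */
static_assert(sizeof(struct delta_holder) >= sizeof(struct delta*),
              "a pointer to an incomplete type has a known size");

/* Not fine: embedding by value needs the complete type, so uncommenting
 * this fails to compile until the definition of struct delta is visible.
 *
 * struct broken {
 *     struct delta length;
 * };
 */
```

This is the same reason log_entry.h must include the headers for struct user and struct timestamp but can get away with a bare declaration for struct delta.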

So what’s up with that struct delta? You might object that any user who calls the log_entry_length function will have to include thelog/delta.h on their own as a hidden dependency anyway, and thus encapsulation is broken. False. The user will only need to include this other header file if they wish to mess around with the internals of the type itself. And if they ever do that, the user code gains a direct dependency on this other data type and should be explicitly including the other header file anyway. See “Why Include What You Use?” for some details on this.
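Here is a rough single-file sketch of that argument. The sections stand in for what would be separate files in the real library, the log entry is simplified to two plain integers, and the delta_seconds and delta_free helpers are hypothetical names I made up for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Stand-in for thelog/log_entry.h (simplified) ---- */
struct log_entry {
    long start_time;  /* the real header uses struct timestamp */
    long end_time;
};

struct delta;  /* Forward declaration only. */
struct delta* log_entry_length(const struct log_entry*);
long delta_seconds(const struct delta*);  /* hypothetical accessor */
void delta_free(struct delta*);           /* hypothetical destructor */

/* ---- User code: thelog/delta.h is never included ---- */
long elapsed_seconds(const struct log_entry* entry) {
    /* Holding and passing a struct delta* needs no complete type. */
    struct delta* d = log_entry_length(entry);
    long seconds;

    if (d == NULL)
        return -1;
    seconds = delta_seconds(d);
    delta_free(d);
    return seconds;
}

/* ---- Stand-in for thelog/delta.h plus the library implementation ---- */
struct delta {
    long seconds;
};

struct delta* log_entry_length(const struct log_entry* entry) {
    struct delta* d;

    assert(entry != NULL);
    d = malloc(sizeof(*d));
    if (d != NULL)
        d->seconds = entry->end_time - entry->start_time;
    return d;
}

long delta_seconds(const struct delta* d) { return d->seconds; }
void delta_free(struct delta* d) { free(d); }
```

Note that elapsed_seconds compiles before the definition of struct delta is visible. Only code that touches d->seconds directly, like the implementation at the bottom, needs the full definition.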

The exception

The exception? There should be none (in my opinion), but they exist. Traditionally, and unfortunately, many files under sys/ are not self-contained. This has been a common source of bikesheds in NetBSD (and possibly elsewhere) although I’m having trouble now finding a reference.

When you use symbols in any of these special header files, make sure to check the relevant manual pages to see which dependent header files you require. And… beware that these dependencies vary across operating systems, so you are in for a portability nightmare.
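As an illustration, and assuming a system whose stat(2) manual page lists sys/types.h ahead of sys/stat.h (as some historical BSD pages do), the include order would look like the sketch below. The is_directory helper is a hypothetical name; on systems where sys/stat.h is self-contained, the extra include is simply harmless.

```c
#include <sys/types.h>  /* First: some systems need it before the header below. */
#include <sys/stat.h>

#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if path names a directory, 0 otherwise. */
int is_directory(const char* path) {
    struct stat sb;

    assert(path != NULL);
    if (stat(path, &sb) != 0)
        return 0;  /* Treat any lookup failure as "not a directory". */
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```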

The trick

There is a very simple trick to routinely ensure your header files are self-contained.

In the past, I used very complex approaches to validate this, ranging from building one-line source files that included only the desired header file to plugging these into ATF-based test cases. The boilerplate required to perform this validation was astonishing and a waste of time. As it turns out, there is a much simpler approach and, to tell you the truth, writing about this little trick is what triggered this whole series on header files!

So what is it? Easy: include the header file for the module you are implementing at the very beginning of your module. That’s it! No more, no less. For example, in the implementation file resources.c, make sure to include its corresponding resources.h before anything else. Because nothing precedes that include, compiling the module doubles as a test case that the header file is self-contained, and you get one such test for every header file. If any of your header files is not self-contained, the build will fail as soon as you compile the offending module.
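A minimal sketch of what this looks like follows. So that it compiles as a single file, the contents of resources.h are pasted inline exactly where resources.c would have its first #include, which is what the preprocessor would produce anyway. The struct fields and the resources_total function are hypothetical.

```c
/* ---- resources.h, inlined: resources.c would start with
 *      #include "resources.h" as its very first include ---- */
#if !defined(RESOURCES_H)
#define RESOURCES_H

#include <stddef.h>  /* size_t and NULL: the header pulls in what it needs. */

struct resources {
    size_t open_files;    /* hypothetical field */
    size_t open_sockets;  /* hypothetical field */
};

size_t resources_total(const struct resources*);

#endif  /* !defined(RESOURCES_H) */

/* ---- Rest of resources.c: other includes come after its own header ---- */
#include <assert.h>

size_t resources_total(const struct resources* r) {
    assert(r != NULL);
    return r->open_files + r->open_sockets;
}
```

If resources.h forgot to include stddef.h, the use of size_t at the top of the translation unit would fail to compile. Had some other header that defines size_t been included first, the omission would have gone unnoticed.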

In the case of header-only modules, you can achieve the same effect by applying this trick in the test program corresponding to the header file. Actually, if you have one test program per module (which you do have, right?), you should apply the trick mentioned here both to the .c implementation and to the test program.