The same C program can be written in several ways. All of them may compile correctly and produce an executable, but some are easier than others for other people to read and understand. A “style guide” is a document that explains how C code must be written. The style changes from one institution to another, but in industrial environments strict adherence to these rules is normally required. What follows is an enumeration of the rules that you must observe in this course. Your code must comply with all of them; the sooner you read them and take them into account, the more time you will save in producing code that is easy to read and maintain.
But why do these rules need to be observed when writing programs? The reasons are easy to understand in the context of industrial-size software projects. A few examples are shown next:
Application | Description | Lines of code |
Windows XP | Complete operating system | 40 million |
Linux Kernel | Basic functionality of the operating system | 8.4 million |
Subversion | Version control system | 417,000 |
Google Chrome | Web browser | 1.5 million (C++) and 1.4 million (C) |
PHP | Scripting engine for dynamic web pages | 800,000 |
The Gimp | Graphic editor | 675,000 |
VLC Media Player | Multimedia player | 341,000 (C) and 93,000 (C++) |
When a program is composed of several hundred thousand lines, or even millions, it is very important to write code that is easy to read. If not, a huge effort (and therefore money) is needed to make the slightest modification.
Keep in mind that code is normally written once but read tens of times: to look for a problem, to understand it before changing it, or to write other modules that interact with the rest of the program. The unwritten rule in industry is that code will constantly be read by people who did not participate in its creation.
We describe next the rules that you must observe. They are all numbered to facilitate referencing them when you have to review the style of your code.
Variable Names
Variable, function and file names must be short, descriptive and concrete.
Good |
struct tcp_header header;
bool is_enabled;
int parse_xml_file(FILE *file);
void init_user_interface(void);
list.c
xml_parser.c
math.c
Bad! |
struct tcp_header b;
bool tmp;
int open_xml_file_and_get_content(FILE *f);
void ui(void);
types.h
utils.c
code3.c
Variable and function names must be written in lower case and, if composed of several words, the words must be separated by the symbol “_” (underscore). There are other styles, such as using capital letters to separate words (a style known as “CamelCase”). In this course, we will adopt separation by underscore. To illustrate why this scheme is preferred, read the following two sentences:
IHadAGreenDollWithALargeTShirt
I_had_a_green_doll_with_a_large_t_shirt
Which of the two requires less effort to read? In any case, if the standard library of the programming language you are using is written in CamelCase style (as is the case for Java), then you must use that style for consistency. This is not the case in C; thus, we will use underscore separation.
The macros and constants must be written in upper case to distinguish them from variables and functions.
Constants and public enumerations must include a 3 or 4 character prefix to identify the module in which they are defined (a module is a set of data and functions contained in several files). This avoids conflicts between names in different modules. For example:
#define LST_MAX_SIZE 32

typedef enum
{
    MSG_CONNECT,
    MSG_ACK,
    MSG_DATA,
    MSG_RELEASE
} message_type_t;
Code format
The code must be indented to reflect the logical structure of the program. Tabs must be used to indent, never spaces. The indentation adopted in the course is 4 spaces wide. We recommend that you configure your source code editor so that a tab is displayed as the equivalent of 4 spaces.
The reason to use tabs instead of spaces is that each programmer can then view the code with the indentation width they find most comfortable. You only have to configure the editor to display tabs with the desired width.
The curly braces must be placed following the Allman style, also known as BSD style; that is, on the line following an if or a while (see examples in the rest of this section).
A white space must be inserted before and after operators such as comparison, assignment, etc. A white space must also be inserted between keywords (for, while, if, return, etc.) and the expressions that follow them.
The content of a function must fit completely on one screen. There should be no need to scroll to see the complete code of a function, although some special cases can be considered. In any case, a function body must never exceed the space of two screens.
Lines must not exceed 80 characters in length. This policy simplifies the visualization of several files simultaneously on the screen. Two code fragments are shown, one correctly formatted and another one incorrectly formatted:
Good |
int db_sync(void)
{
    int i, retval = 0, result = 0;

    for (i = 0; i < P_SIZE; i++)
    {
        if (param_info[i].dirty && param_info[i].sync_cb)
        {
            retval = param_info[i].sync_cb(i, param_db[i]);
            result |= retval;
            if (retval == 0)
                param_info[i].dirty = false;
        }
        else
        {
            LOG_WARNING("No callback for param %d", i);
        }
    }
    return result;
}
Bad! |
int db_sync(void) {
int i, retval = 0, result = 0;
for (i = 0; i < P_SIZE; i++){
if (param_info[i].dirty && param_info[i].sync_cb) {
retval = param_info[i].sync_cb(i, param_db[i]);
result |= retval;
if (retval == 0) param_info[i].dirty = false;
} else {
LOG_WARNING("No callback for param %d", i);
}
}
return result;}
Use of the pre-processor
Macros must be used to define array sizes so that they are easy to read and modify. Macros are frequently used also for any other constant values in the code.
#define TIMEOUT_SECS 120
#define MAX_LINE_SIZE 80

char input_line[MAX_LINE_SIZE];
timer = set_timer(TIMEOUT_SECS);
In the case of arrays, the reason is simple: C originally did not allow a variable to be used as the size of an array, and even a const-qualified variable is not a constant expression in C. Thus, the usual way to give an array size a name is to use the preprocessor.
Comments in the code
All functions defined in a “.c” file, both public and private (static), must include a comment at the top explaining their purpose in a line or two. This comment may include remarks specific to the execution of the function.
/*
 * db_sync()
 * Synchronizes the internal database with the firmware files by storing
 * modified parameters on permanent storage.
 *
 * Any parameter marked as "dirty" will be dumped by calling its associated
 * sync callback.
 */
int db_sync(void)
{
    ...
}
These comments are included so that a person exploring the code can understand its purpose without having to study it in detail. The brief comment at the top of the function must be complemented by a descriptive function name.
Comments must be included in those code locations that implement non-trivial operations. It should rarely be necessary to comment a single line of code: if the code is cleanly written, a single line should be self-explanatory.
Good |
/* Call synchronization callback for parameters
 * marked "dirty". Clear dirty flag if callback
 * succeeds. */
if (param_info[i].dirty && param_info[i].sync_cb)
{
    retval = param_info[i].sync_cb(i, param_db[i]);
    result |= retval;
    if (retval == 0)
    {
        param_info[i].dirty = false;
    }
}
Bad! |
if (param_info[i].dirty && param_info[i].sync_cb)
{
    /* Call sync_cb */
    retval = param_info[i].sync_cb(i, param_db[i]);
    result |= retval;
    if (retval == 0)
    {
        /* Set dirty flag to false */
        param_info[i].dirty = false;
    }
}
Code organization
C code is organized in files with extensions “.c” and “.h”. For each “.c” file, there is usually a “.h” file with the same name (list.c, list.h). This pair of files is informally known as a module. The “.c” files are not merely files containing sets of functions. The key to organizing the code in several files and avoiding cross-referencing problems (the compiler complains because a symbol or definition located in a different file is not known) is to understand that what other programming languages call objects, C calls modules (although with a much simpler structure, of course).
Each module (or “object” if preferred) contains a set of functions whose prototypes make up the public interface defined in the “.h” file. The “.c” file contains the implementation of these functions and, in some cases, additional variables and functions accessible only from within the same “.c” file. The prototypes of the public functions (those that can be used from outside the module) are included in the “.h” file so that the rest of the modules may use them by placing an #include directive at the top of the file. The remaining functions, the private ones, appear only inside the “.c” file and are defined with the static prefix so that they cannot be invoked from outside that file.
Each “.c” file must have at the top a directive to include its corresponding “.h” file (for example, list.c must have #include "list.h"). This is done to avoid inconsistencies between the definitions of variables and public functions in the “.c” file and their declarations in the “.h” file. If the corresponding “.h” is included in the “.c” file, the compiler can detect type conflicts.
The “.h” files must contain only public definitions: types, constants, global variables and prototypes of functions to be used outside the module. Everything else must be included in the “.c” file.
Every “.h” file must contain a guard to prevent multiple inclusions. A guard is implemented by surrounding the entire file content with #ifndef SYMBOL and #endif. The symbol name must be unique to this file (it is recommended to use the file name with underscores). The line with the #ifndef directive must be followed by a line with a #define directive using the exact same symbol. The content of the “.h” file is inserted after this second line. An example of a guard implemented in a file named list.h follows.
#ifndef _LIST_H_
#define _LIST_H_
...
... content of file list.h ...
...
#endif /* _LIST_H_ */
The “.h” files must contain only the minimum set of #include directives needed to compile on their own. The best way to test that a “.h” file includes only the essential files is to write a main function like the one shown next:
#include "list.h"

int main(void)
{
    return 0;
}
It is mandatory to define as static all private functions and variables (those that must not be used from any other file). All global variables in a module must be written at the top so that they can be seen at once; avoid spreading their definitions over the file. As a consequence, it is highly recommended to separate the definition of a data structure from the declaration of global variables of that type. The data types are defined first (in the “.h” file if they are public, in the “.c” file if not), followed by the declarations of the global variables that use those types.
Good |
#include "param_db.h"

struct param_entry
{
    char param[PARAM_MAX_LEN];
    char value[VALUE_MAX_LEN];
    char default_value[VALUE_MAX_LEN];
};

struct param_list
{
    struct param_entry param;
    struct list_node *next;
};

/* Global Variables */
int error_code;
static struct param_list *param_map;
Bad! |
#include "param_db.h"

struct param_entry
{
    char param[PARAM_MAX_LEN];
    char value[VALUE_MAX_LEN];
    char default_value[VALUE_MAX_LEN];
};

static struct param_list
{
    struct param_entry param;
    struct list_node *next;
} *param_map;

int error_code;