Code style
My preferred code style is 2-space K&R. This document sets out the justification for that style.
K&R style has the following properties:
Symmetry in size is important because it provides a visual cue that two things are similar in effect/importance.
For example, in written text, the lines of a paragraph are the same distance apart. The space between paragraphs is larger, but also the same between each. The spaces between subsections, sections and chapters are progressively larger still. The space between things indicates their relatedness; the size of a heading indicates how large a topic change it introduces. All line spaces are the same, all paragraph breaks are the same size, all chapter headings are the same size, etc.
Two subsequent statements in a program are related by one happening after the other, i.e. one directly inheriting the state of the previous. Within a control structure, the first statement inside the structure and the statement before the structure are less related than two sequential statements, and so the introducing line of the control structure increases the space between them, and specifies the different nature of their relationship. The finishing line has the opposite, but equivalent role. The opener transitions fully into a block and the closer transitions fully out. Their real effect on the meaning of the code above and below each of them is of the same magnitude, in opposite directions. In K&R, the space the opener and closer of a control structure occupy on the screen is equivalent, to reflect their equivalent importance in terms of effect on the meaning of the code above and below each.
Here is a comparison of brace styles found on Wikipedia:
Name | Open/close equal height | Open/close equal indent | Open/close don't share lines with content |
---|---|---|---|
K&R | ✔ | ✔ | ✔ |
Allman | ✘ | ✔ | ✔ |
GNU | ✘ | ✘ | ✔ |
Whitesmiths | ✘ | ✘ | ✔ |
Horstmann | ✘ | ✔ | ✘ |
Pico | ✘ | ✘ | ✘ |
Ratliff | ✔ | ✘ | ✔ |
Lisp | ✔ | ✘ | ✘ |
Notice the parallel that results between a stand-alone C (or a C-style language) block,
{
  foo();
  bar();
}
an if block in C (or a C-style language) (K&R style),
if (x == y) {
  foo();
  bar();
}
a PHP block-style if block,
if (x == y):
  foo();
  bar();
endif;
a Ruby if block,
if x == y
  foo()
  bar()
end
an if block in the Fish shell,
if x == y
  foo
  bar
end
some HTML/XML,
<ul style="...">
  <li>foo</li>
  <li>bar</li>
</ul>
and an array/object/dictionary literal (JSON, JavaScript, Python, PHP, Ruby...).
var things = [
  'foo',
  'bar',
];

var things = {
  foo: 'foo',
  bar: 'bar',
};
All of these follow a simple pattern:
(introducer)
  (entries)
(finisher)
One line to open, one line to close.
The visual relationship between the opener of the block and its contents is also the same as in languages which use the off-side rule, such as CoffeeScript, YAML, Python, Haskell and SASS:
if foo():
  bar()
  baz()
boo()
The only difference is the lack of a finishing line: a dedicated closer is not needed, because the block is closed by returning to the previous indent.
It is also the same as a control structure lacking braces (if permitted by your code style):
if (foo())
  bar();

if (foo()) {
  bar();
}
This means that the presence/absence of braces has a less dramatic effect on the code's layout. Compare these code samples, for example:
function typeName(obj) {
  if (obj.isFoo())
    return 'foo';
  else if (obj.isBar())
    return 'bar';
  else if (obj.isBoo()) {
    log_notice('Cannot get name of a boo');
    return null;
  } else
    throw new Error();
}

function typeName(obj)
{
  if (obj.isFoo())
    return 'foo';
  else if (obj.isBar())
    return 'bar';
  else if (obj.isBoo())
  {
    log_notice('Cannot get name of a boo');
    return null;
  }
  else
    throw new Error();
}
Code in K&R style has no enforced meaningless lines. Each line tells the reader something they wouldn't otherwise know, so they can progress from one line to the next, gathering new information at each. Blank lines are not bad per se: they are useful for grouping related items, and hitting one is a signpost, like a paragraph or section break in a book, that the old topic has ended and a new one is starting. But whether a blank line is appropriate in any given place depends entirely on the context, and so cannot be decided by the coding style. Under K&R, every blank line has a purpose, decided by the programmer based on the context: to group related lines together.
Here is an example of the consequence of forced meaningless lines:
$foos = array();
foreach (getFoos() as $foo)
{
  $foos[] = transformFoo($foo);
}
Under Allman style, a near-blank line (the lone opening brace) separates the initialisation of $foos and the loop header from the loop body. That gap implies the two are unrelated when in fact they are related: the whole snippet belongs to a single topic, "array of transformed $foos", and Allman style has artificially inserted a topic break in the middle of it.
Under K&R, the whole code block can be properly treated as a single visual unit:
$foos = array();
foreach (getFoos() as $foo) {
  $foos[] = transformFoo($foo);
}
K&R style minimises the amount of vertical space which the code consumes, while ensuring that the syntax of the control structure itself does not share lines with its contents.
Ensuring the control structure doesn't share lines with its content is important. Lines are the "boxes" or categories in which related syntax goes. In the case of a code block, all the syntax related to a given statement goes on the same line. This not only helps visual comprehension, but means line-wise operations (triple-click select line, delete line, cut line, duplicate line, source code diff etc.) are meaningful. With Lisp-style bracing, for example, the last brace sits on the same line as the last statement, so the "delete line" operation cannot be used to delete the last statement, and adding a statement to the end of the block creates a "-1 lines +2 lines" diff instead of only "+1 lines".
Minimising the amount of vertical space is important for information density, i.e. the total amount of information that is readily available per unit of screen space. Using fewer lines helps the reader get a bird's-eye view of the code without using a smaller font and without removing the indents and purposeful blank lines that give the code a visual structure. The more code that can be fit on screen without compromising its visual structure, the more readily the code can be read.
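The diff claim above can be checked mechanically. The following is a minimal sketch in Python using the standard difflib module; the helper name changed_lines and the sample snippets are illustrative assumptions, not taken from any real codebase. It counts how many lines a line-based diff would mark as removed and added when a statement is appended to a block.

```python
import difflib


def changed_lines(before, after):
    """Return (removed, added) line counts for a line-based diff."""
    removed = added = 0
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


# K&R: the closing brace sits on its own line.
knr_before = ["if (x) {", "  foo();", "}"]
knr_after = ["if (x) {", "  foo();", "  bar();", "}"]

# Lisp style: the closing brace shares a line with the last statement.
lisp_before = ["if (x) {", "  foo();}"]
lisp_after = ["if (x) {", "  foo();", "  bar();}"]

print("K&R:  -%d +%d" % changed_lines(knr_before, knr_after))
print("Lisp: -%d +%d" % changed_lines(lisp_before, lisp_after))
```

Under these assumptions the K&R edit touches one line, while the Lisp-style edit rewrites the old last line and adds a new one.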
Suppose you have two statements:
foo();
bar();
and you want to wrap them in an if block wrapped by a for loop. The K&R result is:
for (...; ...; ...) {
  if (...) {
    foo();
    bar();
  }
}
While the Allman result is:
for (...; ...; ...)
{
  if (...)
  {
    foo();
    bar();
  }
}
Under K&R, the overhead in consumed lines for each control structure is 2. In Allman the overhead is 3, i.e. a 50% higher cost in vertical space.
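The line counts above can be reproduced with a small sketch. The Python below is illustrative only: the wrap and nest helpers are hypothetical names, and the renderer handles just the two brace styles being compared. It wraps the two statements in the same two control structures and reports the overhead in lines.

```python
def wrap(lines, header, style):
    """Wrap already-rendered lines in one control structure."""
    body = ["  " + line for line in lines]
    if style == "knr":
        return [header + " {"] + body + ["}"]
    if style == "allman":
        return [header, "{"] + body + ["}"]
    raise ValueError("unknown style: " + style)


def nest(statements, headers, style):
    """Wrap statements in nested structures; headers are outermost first."""
    lines = list(statements)
    for header in reversed(headers):
        lines = wrap(lines, header, style)
    return lines


statements = ["foo();", "bar();"]
headers = ["for (...; ...; ...)", "if (...)"]

knr = nest(statements, headers, "knr")
allman = nest(statements, headers, "allman")

print("\n".join(knr))
print("K&R overhead:", len(knr) - len(statements))
print("Allman overhead:", len(allman) - len(statements))
```

With two nested structures the overhead is 4 lines for K&R and 6 for Allman, which is the same 2-versus-3 ratio per structure.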
This is a survey of well known software projects/companies and their brace/indent style.
Note that some codebases use K&R style for control structures but Allman style for classes and functions. These have been grouped under "K&R".
Company/Project | Brace Style | Indent Type | Indent Size | Spaces inside( ) ? | Reference |
---|---|---|---|---|---|
Google | K&R | spaces | 2 | no (allowed) | Google C++ style guide, Google Java style guide |
V8 (JavaScript engine) (Google) | K&R | spaces | 2 | no | code |
HHVM (Facebook) | K&R | spaces | 2 | no | HHVM guidelines, see also code |
Proxygen (Facebook) | K&R | spaces | 2 | no | code |
Phabricator (Facebook) | K&R | spaces | 2 | no | Phabricator coding standards |
.NET CLR (Microsoft) | Allman | spaces | 4 | no | code |
C# Guidelines (Microsoft) | Allman | spaces | 4 | no | link |
TypeScript (Microsoft) | K&R | spaces | 4 | no | code |
Sun Java JRE/JDK (Oracle) | K&R | spaces | 2 | no | code |
Linux Kernel | K&R | tab | 8 | no | Coding Style |
IntelliJ (JetBrains) | K&R | spaces | 2 | no | code |
Nginx | K&R | spaces | 4 | no | code |
LLVM | K&R | spaces | 2 | no | code |
JavaScriptCore (Apple) | K&R | spaces | 4 | no | code |
WebCore (Apple) | K&R | spaces | 4 | no | code |
systemd (Red Hat) | K&R | spaces | 8 | no | code |
KDE | K&R | spaces | 4 | no | code |
GNOME | K&R | spaces | 8 | no | code |
RequireJS | K&R | spaces | 4 | no | code |
Apache | K&R | spaces | 4 | no | code |
Firefox | K&R | spaces | 2 | no | code |
Chromium (Google) | K&R | spaces | 2 | no | code |
LibreOffice | Allman | spaces | 4 | no | code |
PHP | K&R | tabs | 8 | no | code, code 2 |
Vim | Allman | mixed | 4 | no | code, code 2 |
Git | K&R | tabs | 8 | no | CodingGuidelines, code |
Why does what other projects do matter? Only for familiarity. Switching between code written in different styles can be jarring. With most code having been written in K&R, the code you write will feel more familiar to others and others' code will feel more familiar to you by sharing the style.
The requirement of #2, that all software displaying the code be configured to have the correct tab size, is at best inconvenient, at worst impossible. The problem is there is a large and diverse range of software that will be involved in displaying your code, including:
Debuggers (gdb, lldb, hphpd, Firefox/Chrome JavaScript debugger...)
Diff tools (git diff, GitHub app, Meld, TortoiseGit, KDiff...)
Not to mention code samples that are put in emails, chat messages, code review/bug tracker issues/comments, blog posts and presentation slides.
Each tool will have its own default display size for a tab (usually 8) and each tool may or may not let you change it. GitHub, for example, renders tabs 8 columns wide, which can only temporarily be changed by adding ?ts=... to the URL and reloading the page. Consequently, using tabs, you are destined to find yourself reading your code with the wrong indent size (usually 8), whether you like it or not, and depending on the tool, you may not be able to do anything about it. It is not possible to impose the requirement of user-configurable tab sizes on all the software which happens to render your code.
Spaces impose no such requirement. By embedding the correct indent size directly in the code, the code is rendered correctly everywhere, even if you cut and paste it into an email.
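The difference can be simulated with a short sketch. The Python below uses the built-in str.expandtabs to mimic viewers with different tab widths; the sample sources and the helper names render and deepest_indent are assumptions made for illustration.

```python
# The same nested block, indented once with tabs and once with 2 spaces.
TAB_INDENTED = "if (x) {\n\tif (y) {\n\t\tfoo();\n\t}\n}"
SPACE_INDENTED = "if (x) {\n  if (y) {\n    foo();\n  }\n}"


def render(source, tab_width):
    """Expand tabs the way a viewer with the given tab width would."""
    return source.expandtabs(tab_width)


def deepest_indent(rendered):
    """Width in columns of the most deeply indented line."""
    return max(len(line) - len(line.lstrip(" ")) for line in rendered.splitlines())


for width in (2, 4, 8):
    tabs = deepest_indent(render(TAB_INDENTED, width))
    spaces = deepest_indent(render(SPACE_INDENTED, width))
    print("tab width %d: tabs -> %d columns, spaces -> %d columns" % (width, tabs, spaces))
```

The tab-indented source changes shape with every viewer setting, while the space-indented source always renders with the indent its author chose.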
So what should the tab size be? The size is a balance between:
For this purpose, 2 or 4 is reasonable and common. I find 2 spaces to be preferable because it conserves horizontal real estate and often provides a near symmetry between the height of a line and the size of an indent.
From Steve McConnell's Code Complete Second Edition chapter on Layout and Style:
Subjects scored 20 to 30 percent higher on a test of comprehension when programs had a two-to-four-spaces indentation scheme than they did when programs had no indentation at all. The same study found that it was important to neither under-emphasize nor over emphasize a program’s logical structure. The lowest comprehension scores were achieved on programs that were not indented at all. The second lowest were achieved on programs that used six-space indentation. The study concluded that two-to-four-space indentation was optimal. Interestingly, many subjects in the experiment felt that the six-space indentation was easier to use than the smaller indentations, even though their scores were lower. That’s probably because six space indentation looks pleasing. But regardless of how pretty it looks, six-space indentation turns out to be less readable. This is an example of a collision between aesthetic appeal and readability.
Class members can be classified along four different dimensions:
All else being equal, class members should be sorted in the order of: staticness, type, visibility, abstraction. For example:
abstract class Foo {
  use Trait1;

  // static members
  const _1 = 0;
  static public $_1;
  static protected $_2;
  static private $_3;
  static public final function _1() {}
  static public function _2() {}
  static protected final function _3() {}
  static protected function _4() {}
  static private function _5() {}

  // instance members
  public $_4;
  protected $_5;
  private $_6;
  public function __construct() {}
  public final function _6() {}
  public function _7() {}
  public abstract function _8();
  protected final function _9() {}
  protected function _10() {}
  protected abstract function _11();
  private function _12() {}
}
It is permissible to use a different ordering if the circumstances favour it. For example, a very large class may be more easily navigated with members arranged by topic. (However, a class of that size should probably be split up where possible.)
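The default ordering can be sketched as a sort key. The Python below is a rough illustration, not a tool: the Member record and the rank tables are hypothetical, and the ranking within each dimension (static before instance; constants, then properties, then methods; public, protected, private; final, concrete, abstract) is inferred from the example class above. The placement of the constructor is not modelled.

```python
from dataclasses import dataclass

# Rank tables, one per dimension, inferred from the example class.
STATICNESS = {"static": 0, "instance": 1}
KIND = {"constant": 0, "property": 1, "method": 2}
VISIBILITY = {"public": 0, "protected": 1, "private": 2}
ABSTRACTION = {"final": 0, "concrete": 1, "abstract": 2}


@dataclass(frozen=True)
class Member:
    name: str
    staticness: str = "instance"
    kind: str = "method"
    visibility: str = "public"
    abstraction: str = "concrete"


def sort_key(member):
    """Order by staticness, then type, then visibility, then abstraction."""
    return (
        STATICNESS[member.staticness],
        KIND[member.kind],
        VISIBILITY[member.visibility],
        ABSTRACTION[member.abstraction],
    )


members = [
    Member("_12", visibility="private"),
    Member("_8", abstraction="abstract"),
    Member("$_4", kind="property"),
    Member("_2", staticness="static"),
    Member("$_3", staticness="static", kind="property", visibility="private"),
    Member("_6", abstraction="final"),
]

ordered = [m.name for m in sorted(members, key=sort_key)]
print(ordered)
```

Because the key is a tuple, earlier dimensions always dominate later ones, which matches the "all else being equal" wording of the rule.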