


“The limits of my language are the limits of my world.”
—Ludwig Wittgenstein
Shared conventions make communication possible.
Language is a set of conventions whose value lies not in any intrinsic property but in the extrinsic fact that they are shared: knight is a better spelling than nait not because it is more phonetic (it no longer is, thanks to phonetic drift) but because billions of other people will understand it immediately, which is not true of nait.
Early scribes spent centuries learning to put vowels in words and blanks between them. Editors draw on thousands of years of experience to make books more readable.
The world is moving faster today; we don’t have thousands of years of programming experience to draw on, nor can we spend centuries learning to put vowels in our identifiers. We have to work harder, faster and smarter to make our code readable. We have to be better.
Readability is everything.
Use common sense.
Don’t break rules capriciously, but do break rules when necessary.
“Short words are best and the old words when short are best of all.”
—Winston Churchill
When defining exported symbols, clarity trumps brevity: The client programmer can always define abbreviations as desired.
Be specific — call a rock a “rock”, not a “thing”. Do not be coy; do not keep secrets from the reader. Programs are not murder mystery novels.
When you pick an external identifier, your target audience should be someone who has never heard of your package, someone who is diving into an unfamiliar ten-million-line program with thirty minutes to fix an obscure bug before people start dying. This person does not have time to puzzle out cryptic identifiers; the identifiers need to be blindingly obvious.
Save a life: Make your external identifiers exactly as long as they need to be, neither more nor less. Sweat blood to make them clear.
“The difference between the right word
and the almost right word is the difference
between lightning and a lightning bug.”
—Mark Twain
Identifier length should be proportional to scope and inversely proportional to frequency of use.
Favor short old words over long neologisms.
Favor complete words over word fragments and abbreviations; use the latter only when they unquestionably improve readability.
Use verbs to name functions and nouns to name other values.
Comment abbreviations when introduced if not absolutely obvious.
When adjacent identifiers contain underbars or double-colons, separate them by a double or triple blank:
foo bar zot          # Single blanks fine here.
foo_bar  zot         # Double blanks needed here.
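The spacing rule above is mechanical enough to state as code. The following is a minimal Python sketch, not part of any Mythryl tooling; the function name join_tokens and the choice of exactly two blanks (rather than three) are assumptions made for illustration.

```python
def join_tokens(tokens):
    """Join tokens with one blank, or with two blanks when either
    neighbour contains an underbar or a double-colon."""

    def is_compound(token):
        # "Compound" here means the token has internal punctuation
        # that could be mistaken for a word break.
        return "_" in token or "::" in token

    if not tokens:
        return ""

    pieces = [tokens[0]]
    for previous, current in zip(tokens, tokens[1:]):
        gap = "  " if is_compound(previous) or is_compound(current) else " "
        pieces.append(gap + current)
    return "".join(pieces)


if __name__ == "__main__":
    print(join_tokens(["foo", "bar", "zot"]))
    print(join_tokens(["foo_bar", "zot"]))
```

The wider gap keeps the eye from reading the underbar inside foo_bar as a weaker separator than the blank next to it.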
Use nouns or noun phrases.
Exception: If the package encapsulates only an algorithm, use a verb or verb phrase.
Favor the singular over the plural: snark.pkg not snarks.pkg. (But do use the latter when implementing sets of snarks.)
Layout is the art of using syntax to elucidate semantics.
We use whitespace, indentation, alignment and bridge comments to make the code’s logical structure leap off the page for the reader.
Indent four blanks per nested scope.
Neatness counts!
Where practical, line stuff up to take advantage of early stages in the visual processing pipeline.
For example, reformatting
my (f, e) = if (f < 1.0) scale_up (f, e);
elif (f >= 10.0) scale_dn (f, e);
else (f, e); fi;
as
my (f, e)
    =
    if   (f <  1.0)   scale_up (f, e);
    elif (f >= 10.0)  scale_dn (f, e);
    else                       (f, e);
    fi;
makes the code easier to read.
Similarly, it is much harder to spot the misspelling in
fun is_const (VARIABLE_IN_EXPRESSION _) => FALSE; is_const ( VALCON_IN_EXPRESSION _)=> TRUE; is_const ( INT_CONSTANT_IN_EXPRESSION _)=> TRUE; is_const ( UNT_CONSTANT_IN_EXPRESION _) =>TRUE; is_const (FLOAT_CONSTANT_IN_EXPRESSION _) => TRUE; is_const (STRING_CONSTANT_IN_EXPRESSION _)=> TRUE; is_const ( CHAR_CONSTANT_IN_EXPRESSION _) => TRUE; is_const ( FN_EXPRESSION _) => TRUE; end;
than in
fun is_const (       VARIABLE_IN_EXPRESSION _) => FALSE;
    is_const (         VALCON_IN_EXPRESSION _) => TRUE;
    is_const (   INT_CONSTANT_IN_EXPRESSION _) => TRUE;
    is_const (    UNT_CONSTANT_IN_EXPRESION _) => TRUE;
    is_const ( FLOAT_CONSTANT_IN_EXPRESSION _) => TRUE;
    is_const (STRING_CONSTANT_IN_EXPRESSION _) => TRUE;
    is_const (  CHAR_CONSTANT_IN_EXPRESSION _) => TRUE;
    is_const (                FN_EXPRESSION _) => TRUE;
end;
Be nice to your precortical visual pathway and it will be nice to you.
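Lining up the arrows by hand is tedious, so it can help to see the idea as a small program. This is a hedged Python sketch of one possible alignment helper; align_arrows is a hypothetical name, and the sketch assumes each rule sits on one line and that the first "=>" on a line is the rule arrow.

```python
def align_arrows(lines, arrow="=>"):
    """Pad the text before the first arrow on each line so that
    every arrow starts in the same column."""
    split = [line.split(arrow, 1) for line in lines]

    # Widest left-hand side among the lines that contain an arrow.
    width = max(
        (len(parts[0].rstrip()) for parts in split if len(parts) == 2),
        default=0,
    )

    aligned = []
    for line, parts in zip(lines, split):
        if len(parts) != 2:
            aligned.append(line)  # No arrow: leave the line untouched.
            continue
        left, right = parts
        aligned.append(f"{left.rstrip():<{width}} {arrow} {right.strip()}")
    return aligned


if __name__ == "__main__":
    for row in align_arrows(["TIMBLE => tamble tumble;",
                             "FIMBLE_LONG=>famble fumble;",
                             "esac;"]):
        print(row)
```

The sketch left-aligns the patterns; the is_const example right-aligns them instead, which makes a misspelled suffix easier to spot.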
“Consistently separating words
by spaces became a general custom
about the tenth century A.D.,
and lasted until about 1957,
when FORTRAN abandoned the practice.”
—Sun FORTRAN Reference Manual
“The right word may be effective,
but no word was ever as effective
as a rightly timed pause.”
—Mark Twain’s Speeches
Whitespace is Your Friend. Use it liberally to enhance readability. Open up your code; give it room to breathe.
Use bridge comments to visually connect the dots.
For example, often the proximity of the first two lines of a close-packed case statement confuses the eye
case (mimble mamble mumble)
    TIMBLE => tamble tumble;
    FIMBLE => famble fumble;
esac;
but adding a blank line makes the case statement visually fall to pieces:
case (mimble mamble mumble)

    TIMBLE => tamble tumble;
    FIMBLE => famble fumble;
esac;
A bridge comment gives the code room to breathe while still tying it together into a visual whole:
case (mimble mamble mumble)
    #
    TIMBLE => tamble tumble;
    FIMBLE => famble fumble;
esac;
“Writing it is easy, understanding it is hard.”
—Anonymous
Thou shalt not wrap useless parentheses around entire case expressions.
Thou shalt not wrap useless parentheses around entire rule patterns.
The canonical layouts are
case expression
    #
    pattern => expression;
    pattern => expression;
    pattern => expression;
    ...
esac;

case expression
    #
    pattern
        =>
        {   statement;
            statement;
            statement;
            ...
        };
    pattern
        =>
        {   statement;
            statement;
            statement;
            ...
        };
    pattern
        =>
        {   statement;
            statement;
            statement;
            ...
        };
    ...
esac;
Avoid mixing the two models. If you must have both mono-line and multi-line alternatives within the same case, group the mono-line alternatives together at the top if possible.
“Real Programmers don’t comment their code.
It was hard to write; it should be hard to read.”
—Anonymous
Lay out records like case statements, but with two-blank initial indents:
{ key => value,
  key => value,
  key => value
};

{ long_key
      =>
      big_expression,
  long_key
      =>
      big_expression,
  long_key
      =>
      big_expression
};
As always, try to put the shortest alternatives first.
“Easy writing makes damned hard reading.”
—Richard Brinsley Sheridan
Multi-key except statements are implicit case statements. Lay them out accordingly.
The canonical layouts are
expression
except
    key = expression;

expression
except
    long_key
        =
        {   statement;
            statement;
            statement;
            ...
        };

expression
except
    key => expression;
    key => expression;
    key => expression;
    ...
end;

expression
except
    long_key
        =>
        {   statement;
            statement;
            statement;
            ...
        };
    long_key
        =>
        {   statement;
            statement;
            statement;
            ...
        };
    long_key
        =>
        {   statement;
            statement;
            statement;
            ...
        };
end;
“I notice that you use plain, simple language,
short words and brief sentences. That is the
way to write English — it is the modern way
and the best way. Stick to it; don’t let fluff
and flowers and verbosity creep in.
“When you catch an adjective, kill it.
No, I don’t mean utterly, but kill most
of them — then the rest will be valuable.
They weaken when they are close together.
They give strength when they are wide apart.
“An adjective habit, or a wordy, diffuse,
flowery habit, once fastened upon a person,
is as hard to get rid of as any other vice.”
—Mark Twain
Default format is:
fun foo arguments
    =
    body;
In the typical case where the body contains more than one statement, this becomes
fun foo arguments
    =
    {   statement;
        statement;
    };
With a long argument list this becomes one of
fun foo
        argument
        argument
        argument
        ...
    =
    {   statement;
        statement;
    };

fun bar
    (
        argument
        argument
        argument
        ...
    )
    =
    {   statement;
        statement;
    };
Use a where clause to improve readability when the function body consists of some definitions combined in the result:
fun foo arguments
    =
    bar zot
    where
        bar = expression;
        zot = expression;
    end;
Use one-line function definitions only to expose parallelism:
fun foo = tum diddle dum;
fun bar = tum diddle dee;
Pattern-matching function definitions are implicit case statements. Lay them out accordingly:
fun foo arguments => expression;
    foo arguments => expression;
    foo arguments => expression;
    ...
end;

fun foo arguments
        =>
        {   statement;
            statement;
            statement;
            ...
        };
    foo arguments
        =>
        {   statement;
            statement;
            statement;
            ...
        };
    foo arguments
        =>
        {   statement;
            statement;
            statement;
            ...
        };
end;
“Strunk felt that the reader was in serious
trouble most of the time, a man floundering
in a swamp, and that it was the duty of anyone
attempting to write English to drain the swamp
quickly and get his man up on dry ground, or
at least throw him a rope.”
—EB White
Thou shalt not wrap useless parentheses around entire if conditions.
The canonical if statement layouts are
if condition   action;   fi;

if condition   action;
else           action;
fi;

if condition
    #
    big expression;
else
    big expression;
fi;

if condition
    #
    statement;
    statement;
    ...
fi;

if condition
    #
    statement;
    statement;
    ...
else
    statement;
    statement;
    ...
fi;
Use the most readable alternative.
Fine points:
The canonical layouts for conditional expressions are
condition  ??  expression  ::  expression;

condition  ??  expression
           ::  expression;
If neither of those works, use an if.
“Do not say a little in many words,
but a great deal in a few.”
—Pythagoras (582-497 BCE)
“Omit needless words! Omit needless words! Omit needless words!”
—Will Strunk
Commenting is a form of expository writing, and as such the rules of expository writing apply:
Briefer is better — but clarity beats brevity.
If you don’t already have a copy, buy and read Strunk and White’s The Elements of Style. It is the best book on commenting available. Get the classic version they wrote, not the recent version mangled after their deaths without their permission or their taste.
Break comment lines at 40-50 characters — 72 maximum.
Write high-level comments motivating the package as well as low-level ones elucidating details.
Put a motivating comment before each major function. Use short imperative sentences:
# Boojum the snarks thrice each
# to re-establish the softly and
# silently vanishing invariants:
#
fun boojum_snarks  snark_list
    =
    {
        ...
    };
Do not use comments as a crutch. If you find yourself writing
bpl = []; # Breakpoint list.
it means you should rename bpl to breakpoint_list. (When cleaning up other people’s code I find that more often than not the comment where an identifier is declared contains the proper name of that identifier.)
Don’t be stupid. Comments like
close file; # Close file.
do not help anyone. Make every word count.
Do not needlessly break a sentence or clause across lines. For example, do not write
# Oh frabjous day, we have a boojum. Softly
# and silently steal it away.
but rather
# Oh frabjous day, we have a boojum.
# Softly and silently steal it away.
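The 72-character ceiling on comment lines is easy to check automatically. Below is a minimal Python sketch of such a check, offered as an illustration rather than an existing tool. It assumes that everything after the first "#" on a line is comment text, so a "#" inside a string literal would be misread, and that the limit applies to the comment text rather than to the whole source line; long_comment_lines is a hypothetical name.

```python
def long_comment_lines(source, limit=72):
    """Return (line_number, comment_length) for every comment whose
    text is longer than `limit` characters."""
    offenders = []
    for number, line in enumerate(source.splitlines(), start=1):
        # partition() yields an empty marker when the line has no '#'.
        _, marker, comment = line.partition("#")
        text = comment.strip()
        if marker and len(text) > limit:
            offenders.append((number, len(text)))
    return offenders


if __name__ == "__main__":
    sample = "close file;    # Close file.\n# " + "x" * 73
    print(long_comment_lines(sample))
```

Lowering limit to 50 turns the same function into a check for the preferred 40-50 character range.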
In general it is better to use subpackages rather than identifier prefixes for datatype namespace management. For example
package wa {
    Window_Attribute
        = BACKGROUND_NONE
        | BACKGROUND_PARENT_RELATIVE
        | BACKGROUND_RW_PIXMAP        dt::Rw_Pixmap
        | BACKGROUND_RO_PIXMAP        dt::Ro_Pixmap
        | BACKGROUND_COLOR            rgb::Rgb
        #
        | BORDER_COPY_FROM_PARENT
        | BORDER_RW_PIXMAP            dt::Rw_Pixmap
        | BORDER_RO_PIXMAP            dt::Ro_Pixmap
        | BORDER_COLOR                rgb::Rgb
        #
        | BIT_GRAVITY                 xt::Gravity
        | WINDOW_GRAVITY              xt::Gravity
        #
        | CURSOR_NONE
        | CURSOR                      cs::Xcursor
        ;
};
is better than
Window_Attribute
    = WA_BACKGROUND_NONE
    | WA_BACKGROUND_PARENT_RELATIVE
    | WA_BACKGROUND_RW_PIXMAP        dt::Rw_Pixmap
    | WA_BACKGROUND_RO_PIXMAP        dt::Ro_Pixmap
    | WA_BACKGROUND_COLOR            rgb::Rgb
    #
    | WA_BORDER_COPY_FROM_PARENT
    | WA_BORDER_RW_PIXMAP            dt::Rw_Pixmap
    | WA_BORDER_RO_PIXMAP            dt::Ro_Pixmap
    | WA_BORDER_COLOR                rgb::Rgb
    #
    | WA_BIT_GRAVITY                 xt::Gravity
    | WA_WINDOW_GRAVITY              xt::Gravity
    #
    | WA_CURSOR_NONE
    | WA_CURSOR                      cs::Xcursor
    ;
The crucial difference is that the subpackage formulation gives the application programmer the option of abbreviating
case attribute
    #
    wa::BACKGROUND_NONE             => ... ;
    wa::BACKGROUND_PARENT_RELATIVE  => ... ;
    wa::BACKGROUND_RW_PIXMAP _      => ... ;
    wa::BACKGROUND_RO_PIXMAP _      => ... ;
    wa::BACKGROUND_COLOR _          => ... ;
    wa::BORDER_COPY_FROM_PARENT     => ... ;
    wa::BORDER_RW_PIXMAP _          => ... ;
    wa::BORDER_RO_PIXMAP _          => ... ;
    wa::BORDER_COLOR _              => ... ;
    wa::BIT_GRAVITY _               => ... ;
    wa::WINDOW_GRAVITY _            => ... ;
    wa::CURSOR_NONE                 => ... ;
    wa::CURSOR _                    => ... ;
esac;
as
{   include wa;
    case attribute
        #
        BACKGROUND_NONE             => ... ;
        BACKGROUND_PARENT_RELATIVE  => ... ;
        BACKGROUND_RW_PIXMAP _      => ... ;
        BACKGROUND_RO_PIXMAP _      => ... ;
        BACKGROUND_COLOR _          => ... ;
        BORDER_COPY_FROM_PARENT     => ... ;
        BORDER_RW_PIXMAP _          => ... ;
        BORDER_RO_PIXMAP _          => ... ;
        BORDER_COLOR _              => ... ;
        BIT_GRAVITY _               => ... ;
        WINDOW_GRAVITY _            => ... ;
        CURSOR_NONE                 => ... ;
        CURSOR _                    => ... ;
    esac;
};
but the prefix formulation allows no such convenient de-uglification trick.
This rule is a special case of: Favor explicit representations over implicit ones.
Any .pkg file longer than a screenful should have an explicitly defined API, usually in an .api file, occasionally at the top of the .pkg file.
Favor strong sealing when in doubt. (But some packages will need to use weak sealing in order to export sufficient type information to allow equality comparisons to do what you want.)
Reading your API definition (and any dependent documentation) should be sufficient for use; client programmers should not have to read the .pkg file proper in order to use the package.


