Jonathan's Blog: Principles of Programming

February 2025

Jonathan's Rules for Writing Effective Programs

I've written my share of apps: almost all database-driven, split about equally between web and client. I've also maintained them, and had to extend or rewrite bad code. Having pondered what makes good code for many hours, I've come up with these principles. You may agree or disagree; please comment either way.

95% of design problems derive from not anticipating the full scope that the program will need. For example, building a database UI with the assumption that there will only be one user at any given time, or an operating system designed to work with 640K of RAM. Think big; don't limit yourself unnecessarily.

These principles are more-or-less in descending order of importance.

1. Clarity is your first responsibility.

Example

Do NOT be "cute" or "clever". Name your dog "Copy2Server" if you want to, but not a method. I once saw a folder named "Belfrey" (sic) just because it has .bat files. Don't make the maintenance programmer spend an hour tearing his hair out over a SordidList object before finally realizing that you thought you were being clever with your sorted list.

Writing the code in the first place is the easy part: 90% of the time invested in an app over its lifetime is maintenance. A clever solution early on will be a headache later.

Don't worry about efficiency; in most cases, efficiency will follow from clear code. Even if there is a need to optimize later, and this is less and less common today, having clear code will make it much easier to do.

An example of clarity is clear purpose: what is the application meant to do? What is this module for? What does this method do? If you can't answer these questions in a sentence or two, you may need to break up your project (or your module, or your method) into smaller units, or think harder on what they're for.

2. Use the library whenever you can.

When you roll your own code, you force other people to learn about exactly how you chose to address a problem. Coding is not a soapbox or an opportunity to show off; just because your code is 2% faster, or even 20% faster, does not justify taking the time to write what's already been written. Use the library unless it truly cannot do what you need it to do, or the performance is truly unacceptable.

2B: Use the existing code base when you can. Sure, it may look like a mess, but it's probably not as bad as it looks. Code is usually easier to write than to read, but make the effort and understand what came before you. A wise programmer does not assume that just because he writes new code, it'll automatically be better than old code.

Note that it can be a very interesting and useful exercise to reinvent a wheel; just don't do it unless you have time to burn.

3. Document the intention of the code.

Good code is somewhat self-documenting, but don't kid yourself that it needs none at all. Use comments, but don't just parrot what the code says. The best way I've heard it put is: "the code describes what it's doing, the comments describe why."

In most cases, you should write your comments first, and your code second. Start with an outline, then stub class files, then stub interfaces with comments, then start populating the interface and writing private methods.

On the subject of comments: when you have to use an ugly, unusual, or counter-intuitive construct, be sure to document it. Otherwise, some well-intentioned person is sure to clean it up, thereby breaking your code.

Comments are also critical to document your assumptions.

Finally, be sure to document non-obvious side effects. For example, if a method modifies its parameters (assuming they're passed by reference), you must record this fact.
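As a rough sketch of what these kinds of comments might look like together, here is a small Python function. The function `remove_expired` and its data layout are hypothetical, invented only for illustration; the point is that the docstring records the side effect and the assumption, and the inline comment explains why an odd-looking construct is there.

```python
def remove_expired(sessions: dict, now: float) -> int:
    """Delete expired sessions and return how many were removed.

    Side effect: mutates `sessions` in place; the caller's dict shrinks.
    Assumption: each value is an expiry time in seconds since the epoch.
    """
    # Collect the keys first: deleting from a dict while iterating over it
    # raises RuntimeError, so do not "simplify" this into a single loop.
    expired = [key for key, expires_at in sessions.items() if expires_at <= now]
    for key in expired:
        del sessions[key]
    return len(expired)
```

Nothing in the comments repeats what the code already says; they cover the why, the assumption, and the side effect.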

There's no hard-and-fast rule about how many comments to include, but in general a very expressive language should have one or two lines of code per comment, while a very verbose language may have four or five.

Use proper English in your comments, at least to the level of an e-mail. Remember that you're making a permanent record, and three months or three years from now someone may spend hours puzzling over your code just because you didn't specify whether it uses a table or the table, for example.

4. Use effective error handling.

Side Benefit

Effective try-catch blocks help ensure that you always close external objects such as files, database connections, and network connections.

Always check for exceptions around risky code: processing user input, performing IO, network operations, allocating memory, etc. Catch specific anticipated exceptions, but pass on others. If you're opening a file, by all means gracefully handle file not found exceptions, but don't just swallow all errors; if you get a division by zero, that should bubble up the stack.

NEVER silently swallow errors, though some exceptions can be safely ignored.
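A minimal sketch of catching one anticipated exception while letting everything else bubble up, written in Python (where try-catch is spelled try/except). The function name `read_settings_text` and the `default` parameter are hypothetical.

```python
from pathlib import Path


def read_settings_text(path: str, default: str = "") -> str:
    """Return the file's text, or `default` if the file does not exist.

    Only the anticipated failure (a missing file) is handled here. Anything
    else, such as a permission or decoding error, propagates to the caller.
    """
    try:
        # The with-block closes the file even if reading raises.
        with Path(path).open(encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        return default
```

Because there is no catch-all handler, an unexpected error still reaches whoever called the function instead of vanishing.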

Validate input, ESPECIALLY when it comes from users (i.e., text boxes, URLs). Do not trust users to enter numbers in number fields or dates in date fields. Use enumerations to help ensure that parameters are valid.
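One way this might look in Python, as a sketch only: the names `SortOrder`, `parse_quantity`, and `parse_sort_order` are made up for the example, and a real form would have its own fields and rules.

```python
from enum import Enum


class SortOrder(Enum):
    ASCENDING = "asc"
    DESCENDING = "desc"


def parse_quantity(text: str) -> int:
    """Convert a text-box value to a positive whole number, or raise ValueError."""
    try:
        quantity = int(text.strip())
    except ValueError:
        raise ValueError(f"not a whole number: {text!r}") from None
    if quantity <= 0:
        raise ValueError(f"quantity must be positive, got {quantity}")
    return quantity


def parse_sort_order(text: str) -> SortOrder:
    """Map a URL parameter onto the enumeration; unknown values raise ValueError."""
    return SortOrder(text.strip().lower())
```

Once the raw text has been converted, the rest of the program only ever sees an `int` or a `SortOrder`, never an unchecked string.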

5. Build reusable black box modules.

Side Benefit

In a team environment, you can delegate the design and construction of individual modules, specifying the required interface but leaving the details to the developer in charge. You can also have alpha-beta-release generations for each module separately, so when you have a beta of module X you can test it against stable releases of modules Y and Z, even if they're under active development.

The next best thing to using the built-in library is using your own library. You save time and, if your code is well-written (a black box), the maintenance programmers that follow you may not have to learn it. Thinking in terms of reusability is a good habit to get into.

Make your methods and modules black boxes: a user should not need to know how they work, just the inputs allowed and the outputs produced. A method or module should not have external dependencies or magic numbers. If you can't help having such connections (wires coming out of the black box), document them.

6. Write to test.

On serious projects most of your code should be tested with an external tool. This lets you speak with confidence when you make a claim about the robustness of your code. Another benefit of this is refactorability: you can make a change to your code, run the comprehensive test, and be confident that you haven't broken anything up- or downstream. It also reveals many unexpected knock-on effects. For example, if a change in one method affects another in an unintuitive way, a good test rig will find it, and (very important!) will find it immediately, making the cause clear. Otherwise the dependency may not surface until much later as a hard-to-reproduce error in a customer's system where some odd corner case occurs.

Think of testing as an extension of the verifications you've always done. A test rig is basically an elaborate assert.
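To make the "elaborate assert" idea concrete, here is a small sketch using Python's standard `unittest` module. The function under test, `apply_discount`, is hypothetical.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_discount_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(19.99, 0), 19.99)

    def test_out_of_range_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(10.0, 120)
```

Saved in a file, these tests can be run with `python -m unittest`; each test method is just a named assert that gets re-checked on every run.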

If you can, use a production-like testing environment, with similar hardware and data volumes, at least some of the time. If you can't, at least use realistic data, such as a random subset of production data.

Bugs do not magically fix themselves. Changes in tangentially related code may hide an error, causing it to drop off the radar, but a good comprehensive test will find it. This will keep you from thinking you've solved a problem, only to see it come up weeks later, probably during a customer demo.

7. Use good style, especially names.

While style is largely a personal matter, there are some good and bad practices. Consistency is especially important; whether one puts opening braces on a new line or not hardly matters, as long as it's the same throughout your code.

Follow the conventions: methods are usually verb + noun, fields are usually nouns, getters are get + noun (except booleans, which are usually is + adjective). Be internally consistent.

Don't be afraid of long names; it's far more important to be clear than concise, though both is ideal. Auto-completion is a wonderful thing. Use it.

Do not use abbreviations unless they're universally known: "num" and "http" are OK, but "tbl" or "usr" is asking for trouble: you (and later programmers) will have to memorize or guess whether you used an abbreviation, or not, and if so, which one; worst of all, they may misunderstand what the abbreviation actually means. Initialisms are especially bad, as they're often very opaque, though very common and unmistakeable initialisms (e.g., HTTP, FTP) are OK.

8. Write short methods.

The chance of developing a bug rises as a section of code gets longer. As a rule of thumb, if you can't see the method in a single screen, roughly 40 lines, you probably need to break it up. When splitting a method, don't just cut it into thirds from top to bottom; pull out logical chunks, and give them logical names. This is a good time to consider if any of the code might be useful again someday; maybe it should go into a separate module.

Another guideline: if you're not sure what to name a method, it's probably doing more than one thing. Break it up.

9. Do not repeat code.

If you're performing the same operations in more than one place, pull it out into a separate method and call it instead. The benefits should be obvious: if you ever need to change the code, you only have to do so in one place. Otherwise, you're just begging for someone to change one method but not the other, which is especially risky if the old code did in fact work, just not in the desired manner.

Don't repeat values, either. Magic numbers should be stored in constants anyway, but especially if you use the same literal twice you should certainly store it somewhere. Even basic logical values, like zero for a null pointer or -1 for true should probably be replaced with constants (at a pre-compiler level, if possible).
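A brief sketch of both halves of this rule in Python: one shared check instead of two copies, and named constants instead of repeated literals. The constants and function names are hypothetical.

```python
MAX_LOGIN_ATTEMPTS = 3   # hypothetical limit, defined in exactly one place
LOCKOUT_MINUTES = 15     # hypothetical lockout duration


def is_locked_out(failed_attempts: int) -> bool:
    """The single place that decides what "locked out" means."""
    return failed_attempts >= MAX_LOGIN_ATTEMPTS


def lockout_message(failed_attempts: int) -> str:
    # Reuses is_locked_out() rather than repeating the comparison.
    if is_locked_out(failed_attempts):
        return f"Account locked for {LOCKOUT_MINUTES} minutes."
    remaining = MAX_LOGIN_ATTEMPTS - failed_attempts
    return f"{remaining} attempt(s) remaining."
```

If the limit ever changes to five, one line changes and both functions stay in agreement.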

10. Log everything.

When writing code, you can step through it, add scaffolding, peek at locals, and generally collect all the information you need. Once it goes out the door, you'll be at the mercy of your users. So log everything: log when the application starts, where the user is connecting from, which security credentials are used, which files are opened, etc. Consider logging before and after attempting a risky action, so you can quickly identify it as the culprit or rule it out: if your log indicates that the connection to the database was OK, that lets you skip testing it. Include a precise date/time stamp with each entry.
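Here is one way the before-and-after pattern might look with Python's standard `logging` module, as a sketch. The logger name `inventory_app` and the stand-in `connect_to_database` function are hypothetical; a real application would call its actual database driver there.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    # Millisecond-precision timestamp on every entry.
    format="%(asctime)s.%(msecs)03d %(levelname)s %(name)s: %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
log = logging.getLogger("inventory_app")  # hypothetical application name


def connect_to_database(host: str) -> dict:
    """Stand-in for a real connection call; returns a fake handle."""
    return {"host": host, "connected": True}


def open_connection(host: str) -> dict:
    log.info("Connecting to database host=%s", host)
    try:
        connection = connect_to_database(host)
    except Exception:
        # log.exception records the traceback; re-raise so nothing is swallowed.
        log.exception("Database connection failed host=%s", host)
        raise
    log.info("Database connection OK host=%s", host)
    return connection
```

A log that shows "Connecting" without a matching "OK" or "failed" line points straight at the step that hung.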

11. Put configuration data into a configuration file.

Side Benefit

You can easily set your application to point to a different input source, or output destination, for testing purposes.

Do not store connection strings, file locations, or maximum values directly in your code; put them in a config file. Encrypt it, if necessary. If you can't do this, at least put all these parameters into constants, and consider putting these constants into a module or class which just stores configuration data. In relatively complex applications, you should do the same with strings, so you can easily support multiple languages.
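As an illustration, assuming an INI-style file and Python's standard `configparser` module, the idea could be sketched like this. The section names, keys, and values are all invented for the example, and the config text is inlined only so the snippet runs on its own.

```python
import configparser

# Inline sample so the sketch is self-contained; in practice this text would
# live in a file such as app.ini and be loaded with parser.read("app.ini").
SAMPLE_CONFIG = """
[database]
connection_string = Server=localhost;Database=orders
max_rows = 500

[paths]
export_dir = /var/tmp/exports
"""


def load_config(text: str) -> configparser.ConfigParser:
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


config = load_config(SAMPLE_CONFIG)
connection_string = config["database"]["connection_string"]
max_rows = config.getint("database", "max_rows")
export_dir = config.get("paths", "export_dir", fallback="./exports")
```

Pointing the application at a test database then means editing one line of the file, not recompiling or hunting through the code.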

12. The UI should be an important part of the design, not an afterthought.

Some bearded UNIX diehards may scoff, but the truth is that if your app has an interface, and it's ugly, people will think less of the program. If it is not convenient, they probably won't trouble themselves to learn it. UIs should be attractive and responsive, and should display the information users need without being cluttered or confusing.

Your UI is well-designed if your grandmother can sit down and start using it without coaching.

Don't make your users memorize checklists: that's what computers are for! If while preparing the user guide you find yourself writing something like "be sure to do this before that", stop. The users may not read the user guide; anything that they need to do should be enforced by the program itself. For example, a modern car must be in neutral (or the clutch must be down, for a manual), or the start sequence will abort. Rather than jolting the car, surprising the driver, and risking engine damage, the car simply will not start. Of course, it would be good if the vehicle gave specific feedback: "please put the car into neutral before starting," and better still if it took action on its own (shifted to neutral and then resumed the startup process).

13. Your code is well-written if a junior developer can understand it without assistance.

Find someone with some technical experience who's not already a guru in four languages. Send him or her your code, give a one-sentence description of what it does (NOT how it works) and then duct-tape your mouth shut. He or she should be able to figure out how to use it and make some basic modifications without a word of explanation from you. Pay attention to the questions asked, and then modify the interface and the code so that the answers are plain. Remember that you may not be around when this code has to be changed.

If there's no one available to review your code, take a look at something you wrote six months ago. Can you immediately understand what it's doing, and why? Could you extend it or modify it easily? If any of it leaves you confused or unsure, consider why, and resolve to do better in the future.

14. Remember the three rules of optimization.

The first rule of optimization: don't. The second rule of optimization: don't do it, yet. The third rule of optimization: profile first. Don't guess about where the delays are; profile and be sure. Also be sure to test the results of your optimizations; you'd be surprised how often "this MUST be faster" ends up subverting a compiler optimization and actually hurting your performance. If you must rewrite simple code to meet a performance goal, document it thoroughly.

All this applies to obfuscatory optimization, the kind where you have to be "tricky" or resort to lower-level code. If you can get better performance while making your code easier to read, then by all means go for it.

Very often, optimization will not be necessary if you start with an appropriate data structure. Use a collection with the best behavior for the task at hand: the difference between O(1) and O(n²) may not matter while testing, but it probably will for real-world data sets with thousands or millions of entries.
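A small measurement sketch in Python, using the standard `timeit` module. Note that it compares an O(n) list scan against an average O(1) set lookup, not the O(n²) case mentioned above; the collection size and repeat count are arbitrary choices for the example.

```python
import timeit

entries = list(range(100_000))
entries_as_set = set(entries)
missing_value = -1  # worst case for the list: every element is checked


def lookup_in_list() -> bool:
    return missing_value in entries          # O(n) scan


def lookup_in_set() -> bool:
    return missing_value in entries_as_set   # O(1) on average


# Measure instead of guessing.
list_seconds = timeit.timeit(lookup_in_list, number=200)
set_seconds = timeit.timeit(lookup_in_set, number=200)
print(f"list: {list_seconds:.4f}s  set: {set_seconds:.6f}s")
```

Both functions return the same answer; only the container differs, and no "tricky" code was needed to get the speedup.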

15. Build agnostic interfaces.

Side Benefit

With this kind of design you can more easily connect your code with a test data source, or hook it up to run from the command line when you don't need the UI.

Side Benefit

When a project is divided into layers, you can devote the programmer with the best skill set to that layer. E.g., the database developer writes sprocs and database IO code, the algorithms guy writes the middle tier, and the UI team designs and tests the interface.

When you build an interface, you create a layer of isolation between your code and other systems. You can refactor, redesign, or rebuild one layer without impacting others, without breaking other modules. Otherwise you have an unapproachable monolith.

Build your layers agnostically, with an emphasis on the interface methods rather than the implementation. Today's application may run against a flat text file and log to another, while displaying on a DirectX screen and reporting errors with a message box. Tomorrow it may read from a SQL database, write to an XML file, display in HTML with AJAX, and report via e-mail. Isolate the specific code for reading and writing into another module, or another layer, and let the meat of your program stay independent of those details. This will let you easily port your code as languages, inputs, outputs, and operating systems change.
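One possible shape for this in Python, sketched with `typing.Protocol` (Python 3.8 or later). Every name here (`RecordSource`, `RecordSink`, `InMemorySource`, `ListSink`, `normalize_names`) is hypothetical; the point is only that the core function depends on the interface, not on files or databases.

```python
from typing import Iterable, List, Protocol


class RecordSource(Protocol):
    def read_records(self) -> Iterable[str]: ...


class RecordSink(Protocol):
    def write_record(self, record: str) -> None: ...


class InMemorySource:
    """Test double; a file or SQL source would expose the same method."""

    def __init__(self, records: List[str]) -> None:
        self._records = records

    def read_records(self) -> Iterable[str]:
        return iter(self._records)


class ListSink:
    """Collects output in memory; an XML or e-mail sink would look the same."""

    def __init__(self) -> None:
        self.written: List[str] = []

    def write_record(self, record: str) -> None:
        self.written.append(record)


def normalize_names(source: RecordSource, sink: RecordSink) -> int:
    """Core logic: knows nothing about files, databases, or screens."""
    count = 0
    for record in source.read_records():
        sink.write_record(record.strip().title())
        count += 1
    return count
```

Swapping the flat file for a SQL table means writing one new class with a `read_records` method; `normalize_names` does not change.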

16. Focus on preventing bugs, not finding them.

Let the compiler find bugs for you, by using compile-time structures like type-safe collections and compile-time casts.

Read through your code, and be sure that you really understand it. Consider possible inputs and reasonable exceptions. At some point you have to run it, but you shouldn't do that until you're reasonably confident that your code is in fact correct.

It can be a useful exercise to write code without an IDE or compiler. If you write several hundred lines of code and it works on the first pass through, you're on the right track.

17. Fail fast.

Check your inputs before running complicated or time-consuming code that you may not need to run.
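A minimal sketch in Python, with a hypothetical `build_report` function whose body stands in for the slow work:

```python
def build_report(customer_ids: list, start_year: int, end_year: int) -> dict:
    # Cheap checks first: reject bad input before any expensive work begins.
    if not customer_ids:
        raise ValueError("customer_ids must not be empty")
    if start_year > end_year:
        raise ValueError(f"start_year {start_year} is after end_year {end_year}")

    # Stand-in for the slow part (queries, aggregation, rendering).
    return {cid: list(range(start_year, end_year + 1)) for cid in customer_ids}
```

A caller who swaps the two years finds out immediately, with a message that names the problem, rather than after a long run that produces an empty report.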

18. Use caching.

Store the results of slow operations, such as querying a database, unless you must have up-to-the-minute results every time. You can, of course, have code to refresh the cache, perhaps using stored data if it's less than x minutes old, refreshing otherwise.
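The "use it if it's less than x minutes old" idea could be sketched in Python like this. The class `TimedCache` and its parameters are hypothetical; the `clock` argument exists so the age check can be tested without waiting.

```python
import time
from typing import Callable, Optional


class TimedCache:
    """Caches one value and reloads it once it is older than max_age_seconds."""

    def __init__(self, loader: Callable[[], object], max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._max_age = max_age_seconds
        self._clock = clock
        self._value: object = None
        self._loaded_at: Optional[float] = None

    def get(self) -> object:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._max_age:
            self._value = self._loader()   # the slow call, e.g. a database query
            self._loaded_at = now
        return self._value

    def refresh(self) -> None:
        """Force the next get() to reload."""
        self._loaded_at = None
```

Usage would look something like `cache = TimedCache(run_slow_query, max_age_seconds=300)` followed by `cache.get()` wherever the result is needed, where `run_slow_query` is whatever function performs the slow operation.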

19. Use lazy loading, where appropriate.

This one has plenty of exceptions. In most business apps, your program will have many more capabilities than each user individually needs, so there's no need to load data unless and until it's needed. On the other hand, in intensive applications that require maximum responsiveness, like games, it makes sense to pre-load so there's no chugging later. Of course, even games don't load everything up front; they're broken into levels.
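For the business-app case, a sketch in Python using `functools.cached_property` (Python 3.8 or later). The `CustomerRecord` class and its fake order data are hypothetical; `load_count` is there only to show when the load actually happens.

```python
from functools import cached_property


class CustomerRecord:
    def __init__(self, customer_id: int) -> None:
        self.customer_id = customer_id
        self.load_count = 0  # only here to show when loading happens

    @cached_property
    def order_history(self) -> list:
        """Loaded on first access, then reused on later accesses."""
        self.load_count += 1
        # Stand-in for a slow query.
        return [f"order-{self.customer_id}-{n}" for n in range(3)]
```

Creating a `CustomerRecord` costs nothing extra; the order history is fetched only if some screen actually asks for it.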

20. Work atomically.

Don't set up broad security levels, like "user" and "admin"; rather users should have individual privileges. You may, and probably should, create sets of privileges that users can use to easily grant all the privs for a role, but remember that they may have a different idea about what a role includes, so let them get specific if they want.
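A sketch of individual privileges with optional role presets, using Python's `enum.Flag`. The privilege names and the two presets are invented for the example.

```python
from enum import Flag, auto


class Privilege(Flag):
    VIEW_ORDERS = auto()
    EDIT_ORDERS = auto()
    ISSUE_REFUNDS = auto()
    MANAGE_USERS = auto()


# Role presets are just named bundles of individual privileges.
ROLE_PRESETS = {
    "clerk": Privilege.VIEW_ORDERS | Privilege.EDIT_ORDERS,
    "supervisor": (Privilege.VIEW_ORDERS | Privilege.EDIT_ORDERS
                   | Privilege.ISSUE_REFUNDS),
}


def can(user_privileges: Privilege, needed: Privilege) -> bool:
    """Check one specific privilege, never a role name."""
    return needed in user_privileges


# Start from a preset, then get specific for one user.
trusted_clerk = ROLE_PRESETS["clerk"] | Privilege.ISSUE_REFUNDS
```

Because the checks ask about a privilege rather than a role, a customer who wants clerks to issue refunds can grant exactly that without promoting anyone to supervisor.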
