The Importance of Naming in Computer Programming
Tommy Elliott | August 26th, 2020
Who has ever encountered a variable name of `obj` or `item` or `num` or `str` or `ret` or `val` or `test` or `count` when reading code?
I'd expect that nearly all programmers who read others' code regularly have experienced some of these, or similarly unclear or unspecific, names before. More often than not with such names, you have to investigate further to determine the full meaning of the named thing. Once you've deduced that full meaning, do you think you could come up with a better name, one that more accurately describes the true meaning of the thing?
Why does it matter?
When we consider the audiences of source code in computer programming, the primary audience is obvious to all programmers. Some computer process will parse through the code to turn it into the desired functionality/display/feature/etc - to change it into 1's and 0's that ultimately affect either some data or some display or both, so that someone's life can be made better in some way. A program with every variable/class/entity named like `w48293` (some letter followed by some number) could create the same functional result for this audience as a program with great naming.
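To make that concrete, here is a minimal sketch in Python. Both functions, and every name in them, are made up for illustration. They do exactly the same thing as far as the computer is concerned.

```python
# Hypothetical example: both functions are invented for illustration.

def w48293(w1, w2):
    # Opaque names: the interpreter does not care, but a reader has to
    # trace every line to learn what is being counted.
    w3 = 0
    for w4 in w1:
        if w4 > w2:
            w3 += 1
    return w3


def count_readings_above(readings_celsius, threshold_celsius):
    """Count how many temperature readings are strictly above the threshold."""
    reading_above_threshold_count = 0
    for reading_celsius in readings_celsius:
        if reading_celsius > threshold_celsius:
            reading_above_threshold_count += 1
    return reading_above_threshold_count
```

Calling either function with the same list of readings and the same threshold returns the same count; only the second tells a human what that count means.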
This is why naming matters.
The other audience for that code is every other programmer that ever needs to understand or maintain that code in the future, which often also includes your future self. That can be quite a lot of people when you consider both the lifetime of the product and the size of the programming team or community (now and in the future). That path to understanding can also represent quite a lot of time, effort and money.
The Costs and Savings due to Naming Quality
The quality of code is often defined by how efficiently and effectively the code can be understood, maintained, and worked on (aside from performance issues). Higher code quality leads to better products over the long term, fewer issues, and less time and effort required to maintain the product; lower code quality leads to the opposite.
Poorly thought-out, first-thing-that-pops-into-your-head names may seem like the easiest or most obvious choice at the time they're created, but please take the few extra moments to carefully consider naming each thing so it can be understood from an outsider's perspective. It is very easy to fall into the trap of being so immersed in a project or concept that names we create make sense to us at the time, but wouldn't make sense to someone seeing the program for the first time. We don't want names to require investigation of the context or require any in-depth product knowledge (I've-been-working-on-this-for-two-months-straight-so-I-know-exactly-what-this-acronym-means), especially when a little extra thought now could overcome that.
When we read a name in programming and it isn't specific or clear enough to describe the value or entity it represents, it forces us to investigate further. We are forced to look for context within the code block, and sometimes even further, to any or all uses of that name. We can look to nearby comments, but comments require maintenance and may become stale, useless, or even incorrect if not maintained properly. Sometimes we have to go even further than that - to find related entities based on part of a name. If some of those related names have abbreviations/inconsistencies/etc, searching could provide incomplete results, leaving us with a high probability of making mistakes during refactoring, or extra time spent manually finding each related entity and extra uncertainty about what might have been missed. And if naming conventions like camelCase or snake_case or PascalCase aren't applied consistently within the language and your programming organization, things get even harder to find, classify, group, refactor, enhance, etc. What's the takeaway here? Maintaining code with bad naming leads to more problems and expenses, especially in the long run, compared to maintaining code with good naming.
The easiest way to improve naming practices in coding is to simply take a few extra moments to consider each name you create with a critical eye. Could this name be interpreted in ways besides the way I intend it? Does this name fully describe the meaning of the represented value or entity? Does this function name accurately communicate what it does? What values will this variable hold at different points in time, and is the name accurate for all of those? Is this name consistent with the other names used in the product/language/organization/industry?
Others have written on this topic many times in the past. One particular article stood out to me because it is backed by research and includes numerous helpful suggestions, so I will first point you to that excellent article for those who are interested in seeing more detail and a larger collection of suggestions. In this post I'll summarize the guidelines that are most impactful for me. In your future programming, try to consider these and their potential impact when you name things.
Syntax and General Guidelines:
-
Use naming conventions of the programming language being used. Capitalization, multi-word name syntax, and the general language conventions for various programming elements like classes, interfaces, properties, constants, and so on, should be followed everywhere except where your organizational coding standards intentionally differ. Refactor: Apply standard language-specific casing, and use language-specific code inspection tools to enforce it.
-
Avoid differentiating identifiers with number suffixes. Number suffixes don't add meaningful distinction between variables with the same base name and a different number at the end. Refactor: Use additional or different words in names to differentiate meaning.
-
Fully spell out dictionary words, with correct spelling. Abbreviated or incorrectly spelled words, even when used repeatedly, cause confusion, ambiguity, and inconsistency. They also make searching for uses of specific words difficult. Abbreviations can also be read as different words than intended in many cases (like `mod` - it could mean module, modulus, mode, modal, model, modem, or modality). Typing the full word, spelled correctly, makes it reliably searchable, and makes the original intent explicit, instead of forcing readers to interpret the ambiguous meaning for themselves (wasting time better spent elsewhere). Refactor: Correct spelling of misspelled words, and change abbreviated words to the full words. Make exceptions for unambiguous well-understood values like ID, and documented domain-specific acronyms and abbreviations, when they aren't ambiguous.
-
Name constant values. Unnamed constants, which include magic numbers, each typically invoke a thought response of "Why is that there, and what does it represent?" (another time-waster for readers). Refactor: Give each constant value a name, specifically for what it represents, not just for the value it holds (WaterBoilingPointCelsius instead of 100 or OneHundred; DaysPerWeek instead of 7 or Seven).
-
Place qualifiers as suffixes when multiple variations exist. Doing this naturally groups similar names together, improving efficiency for programming using those names. Refactor: Move qualification wording to the end of names (AgeMaximum, AgeMinimum, AgeAverage).
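As a rough sketch of how a few of these guidelines might look together, here is a small Python example. Every name in it is hypothetical, and the casing follows Python's own conventions (snake_case for functions and variables, UPPER_CASE for constants) rather than the PascalCase used in the examples above, which is the first guideline in action.

```python
# Hypothetical example: all names below are invented for illustration.

# Named constants describe what the value represents, not the value itself.
WATER_BOILING_POINT_CELSIUS = 100
DAYS_PER_WEEK = 7


def is_water_boiling(temperature_celsius: float) -> bool:
    """Compare against a named constant instead of a bare 100."""
    return temperature_celsius >= WATER_BOILING_POINT_CELSIUS


def convert_weeks_to_days(week_count: int) -> int:
    """Fully spelled-out words instead of something like `wks * 7`."""
    return week_count * DAYS_PER_WEEK


def summarize_ages(ages: list[int]) -> dict[str, float]:
    """Qualifiers go last, so the related names group together."""
    age_minimum = min(ages)
    age_maximum = max(ages)
    age_average = sum(ages) / len(ages)
    return {
        "age_minimum": age_minimum,
        "age_maximum": age_maximum,
        "age_average": age_average,
    }
```

Notice that none of the names needs a number suffix to be told apart, and a search for "age_" finds all three statistics at once.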
Vocabulary Guidelines:
-
Describe meaning. One of the most important ways to ensure good naming is to apply names that conceptually describe what the identifier represents. Avoid thoughtless/meaningless names (foo, blah, temp). Refactor: Describe what the identifier represents.
-
Choose wording that has a single specific clear meaning. Consider whether wording is too vague; prefer wording with greater specificity. Consider whether wording could lead to multiple interpretations; prefer wording that minimizes the possibility of being misinterpreted. Refactor: Replace vague wording (data, object, value) that could apply to a variety of data, with more specific words that would only be correct for this name. Replace wording that has multiple possible interpretations or meanings (set, run, check, call, place, clear, break - see https://muse.dillfrog.com/lists/ambiguous for more examples) with wording that has fewer possible interpretations or meanings.
-
Use a large vocabulary. Prefer a richer single word over multiple words when it fits the concept. The thesaurus is your friend; just do a web search for your concept, adding "synonyms" to the search, and you'll often be able to find a word or words that more concisely and/or more exactly describe the concept. Refactor: Replace multiple words describing a concept when "there's a word for that" (Employee instead of CompanyPerson).
-
Use problem domain terms. Consistently use and apply the correct terminology that subject-matter experts use in the problem domain. This helps to clear up communication issues when features and requirements (programming details) need to be discussed with industry experts. Refactor: Rename identifiers to use the correct terminology in the problem domain.
-
Compose names that won't be confused with other names. Avoid using names that differ from existing names only by spelling (with the same phonetic pronunciation when spoken), or only by wording (with the same meaning), or only by a few letters, or only by word order. Use appropriate distinguishing words, and don't use names that have the same meaning as each other. The goal here is to prevent any misunderstandings about what distinguishes each identifier, and to avoid confusing one for another. Refactor: Make differences more explicit by adding or changing or using different words. Each identifier should be easily distinguishable from all other identifiers.
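Here is one way the vocabulary guidelines could play out, as a before-and-after sketch in Python. The payroll-flavored domain, the Employee class, and both functions are assumptions made up for this example; the two functions behave identically.

```python
from dataclasses import dataclass


# Hypothetical domain type: "Employee" rather than "CompanyPerson".
@dataclass
class Employee:
    full_name: str
    hourly_wage_dollars: float


# Before: "process", "data", and "val" could describe almost anything.
def process(data, val):
    return [item for item in data if item.hourly_wage_dollars > val]


# After: each name is only correct for this one purpose.
def select_employees_earning_above(employees, hourly_wage_threshold_dollars):
    return [
        employee
        for employee in employees
        if employee.hourly_wage_dollars > hourly_wage_threshold_dollars
    ]
```

The second version is longer to type, but a reader can tell what goes in and what comes out without opening the function body.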
Data Type Guidelines:
-
Use singular names for single values, and use plural names for collections. Only pluralize names that refer to collections like lists. Having a single value with a pluralized name (or vice versa) is unintuitive, and leads to worse comprehension. Refactor: Make single value identifiers use singular names (carCount instead of carCounts), and make collection identifiers use pluralized names (remainingCars instead of remainingCar).
-
Use appropriately matched opposites or wording pairs. Consistently apply appropriate word pairing. Some typical pairs include: add/remove, begin/end, create/destroy, destination/source or target/source, first/last, increment/decrement, insert/delete, lock/unlock, minimum/maximum, next/previous, old/new, open/close, show/hide, start/stop or start/finish, and up/down. Refactor: Use the correct opposite or paired word, and use it consistently.
-
Use appropriate grammatical forms for Boolean names so they imply their value will be either true or false. The name alone should indicate to the reader that the value will be either true or false. Refactor: Reword Boolean names so that their value is implied to be either true or false (Started or IsStarted instead of Start or Begin or Status or Progress).
-
Use positive Boolean names. You avoid a lot of confusion by keeping Boolean values as either simply positive (without the programming not operator - like isEnabled) or negative (with the programming not operator - like !isEnabled). When you use a negative Boolean name (isDisabled or isNotEnabled), negating it makes for a double negative (!isDisabled or !isNotEnabled), which takes more effort to comprehend and keep track of in Boolean logic. Refactor: Invert the name's meaning so it indicates a positive Boolean (remove the "not" or replace the negative wording with positive wording), and apply or remove the corresponding programmatic negations in existing uses to incorporate the new meaning.
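A short Python sketch of these data type guidelines follows. The Download class and the describe_button function are hypothetical, and the names use Python's snake_case (is_enabled rather than isEnabled, `not` rather than `!`).

```python
# Hypothetical example: all names below are invented for illustration.

remaining_cars = ["sedan", "coupe", "wagon"]  # plural name: a collection
car_count = len(remaining_cars)               # singular name: one value


class Download:
    """start/stop are a matched pair; is_started reads as true-or-false."""

    def __init__(self) -> None:
        self.is_started = False

    def start(self) -> None:
        self.is_started = True

    def stop(self) -> None:
        self.is_started = False


def describe_button(is_enabled: bool) -> str:
    # A positive name negates cleanly: "not is_enabled" is a single negative.
    # With a parameter named is_disabled, the enabled case would have to be
    # written as "not is_disabled", a double negative.
    if not is_enabled:
        return "greyed out"
    return "clickable"
```

If the method pair were start/end or begin/stop, a reader would have to wonder whether the mismatch was deliberate.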
Class and Method Name Guidelines:
-
Name classes to be read as a noun phrase, and methods to be read as a verb phrase. A class name should represent an entity or a classification - a noun (not an action). A method name should represent taking action or doing something - a verb (not an entity). A function, which takes action to return a result, should have a name that represents taking the action and usually also indicates what to expect for the result. Refactor: Rename classes to represent a noun (IndividualTestResult instead of TakingATest), and rename methods and functions to represent a verb (CalculateTestResult instead of TestResult).
-
Name classes to be inclusive of all possible states and values the class could represent. When a class could exist in multiple statuses or contain a variety of values, name the class to be inclusive of all of those states and varieties. Refactor: Make the class name less specific to accommodate all possible states and varieties for that class (CargoVehicle instead of Truck, if the class represents any vehicle that can transport cargo).
-
Be intentional and transparent with use of prefixes (get/set/is/has), validation words (validate/check/ensure), and transformation words (transform/convert/as/to) in method naming. There are inherent expectations built into most of these words. "get" is expected to read/retrieve a value without any side effects. "set" is expected to save/overwrite a value without any side effects. "is" and "has" are both expected to read/retrieve a Boolean value without any side effects. "validate" and "check" and "ensure" are all expected to return the result of that validation, including having some resulting indication of failure. "transform" and "convert" and often "to" and "as" in method names are typically expected to perform some transformation on an input and to return the transformed object as the output. Refactor: If any wording gives an expectation that isn't met, either change the method/function to meet the expectation or change the name to not imply that expectation. That includes wording that leads the reader to expect no side effects; if there are side effects, rename the method/function to better represent all of what it does (or refactor it into multiple functions each having a single purpose).
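To illustrate the class and method guidelines, here is a hedged Python sketch built around the IndividualTestResult example above. The fields, the methods, and the 70 percent passing threshold are all assumptions invented for this example.

```python
from dataclasses import dataclass


# Hypothetical example: the fields and the default threshold are assumptions.
@dataclass(frozen=True)
class IndividualTestResult:  # a noun phrase: an entity, not an action
    correct_answer_count: int
    question_count: int

    def calculate_score_percent(self) -> float:
        """Verb phrase: takes an action and says what result to expect."""
        return 100 * self.correct_answer_count / self.question_count

    def is_passing(self, passing_score_percent: float = 70.0) -> bool:
        """'is' prefix: returns a Boolean and changes nothing."""
        return self.calculate_score_percent() >= passing_score_percent

    def to_summary_text(self) -> str:
        """'to' prefix: returns a transformed representation."""
        return f"{self.correct_answer_count}/{self.question_count} correct"


def validate_test_result(test_result: IndividualTestResult) -> list[str]:
    """'validate' returns its outcome: an empty list means no problems."""
    problems = []
    if test_result.question_count <= 0:
        problems.append("question_count must be positive")
    if not 0 <= test_result.correct_answer_count <= test_result.question_count:
        problems.append("correct_answer_count must be between 0 and question_count")
    return problems
```

Because the dataclass is frozen, none of these methods can quietly modify the result, so the no-side-effects expectation carried by "is" and "to" holds by construction.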
There is future cost (additional effort, frustration, confusion, mistakes, etc) incurred by every unclear, ambiguous, imprecise, abbreviated, misleading, or incomprehensible name used in our profession. The frequency of bad names is a big factor - the more prevalent they are, the more confusing the code as a whole becomes. The lifespan of bad names is a big factor - the longer poor names are left unchanged, the more overhead they incur. When you improve a name after spending the time to figure out its true purpose and a more fitting name, that extra time doesn't need to be spent again by the future programmers who read it. Granted, with renaming there is usually a tradeoff involved (effort spent now vs effort saved later), and renaming some things requires more effort than renaming others, but in many cases the extra few seconds or minutes it takes to rename a thing ends up being well worth it for the purpose of saving time and preventing issues for future programmers. And lastly, consistency among names is a big factor - plurality, naming syntax for classes/properties/functions/etc within the language, abbreviations, acronyms, and capitalization are all areas where inconsistencies or ambiguities can greatly hinder or cause issues for future developers trying to find or reference specific things or categories of things. Many languages have pretty standardized conventions, and beyond that, it is important that your programming organization defines (or references) coding standards to follow for the languages they use most. Incorporating code review into your normal programming process, both to validate that code follows the organization's coding standards and to improve code quality by asking a second person to try to understand that code, makes a big difference in reducing future problems and technical debt.
I hope you've now seen that spending that little extra effort to name things well saves future programmers time and money, and prevents confusion, frustration, and mistakes when they revisit your code. Practice and master naming things clearly and meaningfully; your team, current and future programmers, and possibly even your future self will all be grateful to you for it.