Naming convention (programming)
In computer programming, a naming convention is a set of rules for choosing the character sequence to be used for identifiers which denote variables, types, functions, and other entities in source code and documentation.
Reasons for using a naming convention include the following:
- To reduce the effort needed to read and understand source code;
- To enable code reviews to focus on issues more important than arguing over syntax and naming standards;
- To enable code quality review tools to focus their reporting mainly on significant issues other than syntax and style preferences.
Potential benefits
Some of the potential benefits that can be obtained by adopting a naming convention include the following:
- to provide additional information about the use to which an identifier is put;
- to help formalize expectations and promote consistency within a development team;
- to enable the use of automated refactoring or search and replace tools with minimal potential for error;
- to enhance clarity in cases of potential ambiguity;
- to enhance the aesthetic and professional appearance of work product;
- to help avoid "naming collisions" that might occur when the work product of different organizations is combined;
- to provide meaningful data to be used in project handovers which require submission of program source code and all relevant documentation;
- to provide better understanding in case of code reuse after a long interval of time.
Challenges
Readability
Well-chosen identifiers make it significantly easier for developers and analysts to understand what the system is doing and how to fix or extend the source code to meet new needs. For example, although the statement
a = b * c;
is syntactically correct, its purpose is not evident. Contrast this with:
weekly_pay = hours_worked * hourly_pay_rate;
which implies the intent and meaning of the source code, at least to those familiar with the context of the statement.
Common elements
The exact rules of a naming convention depend on the context in which they are employed. Nevertheless, there are several common elements that influence most if not all naming conventions in common use today.
Length of identifiers
Fundamental elements of all naming conventions are the rules related to identifier length. Some rules dictate a fixed numerical bound, while others specify less precise heuristics or guidelines. Identifier length rules are routinely contested in practice, and subject to much debate academically.
Some considerations:
- shorter identifiers may be preferred as more expedient, because they are easier to type
- extremely short identifiers are very difficult to uniquely distinguish using automated search and replace tools
- longer identifiers may be preferred because short identifiers cannot encode enough information or appear too cryptic
- longer identifiers may be disfavored because of visual clutter
Brevity in programming could be in part attributed to:
- early linkers which required variable names to be restricted to 6 characters to save memory. A later "advance" allowed longer variable names to be used for human comprehensibility, but where only the first few characters were significant. In some versions of BASIC such as TRS-80 Level 2 Basic, long names were allowed, but only the first two letters were significant. This feature permitted erroneous behaviour that could be difficult to debug, for example when names such as "VALUE" and "VAT" were used and intended to be distinct.
- early source code editors lacking autocomplete
- early low-resolution monitors with limited line length
- much of computer science originating from mathematics, where variable names are traditionally only a single letter
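The collision between "VALUE" and "VAT" mentioned above can be illustrated with a minimal Python sketch. It does not reproduce any real BASIC interpreter; the class `TruncatingVariableStore` and the helper `significant_key` are hypothetical names introduced only to model a language in which just the first two characters of a name are significant.

```python
def significant_key(name: str, significant_chars: int = 2) -> str:
    """Return the part of a name that a truncating interpreter would keep."""
    return name[:significant_chars].upper()


class TruncatingVariableStore:
    """Hypothetical variable table keyed only on the significant prefix."""

    def __init__(self, significant_chars: int = 2):
        self._significant_chars = significant_chars
        self._values = {}

    def set(self, name: str, value) -> None:
        key = significant_key(name, self._significant_chars)
        self._values[key] = value

    def get(self, name: str):
        key = significant_key(name, self._significant_chars)
        return self._values[key]


store = TruncatingVariableStore()
store.set("VALUE", 100)
store.set("VAT", 20)       # silently overwrites VALUE: both names reduce to "VA"
print(store.get("VALUE"))  # the value stored under VAT is returned
```

Because both names reduce to the same two-character key, the second assignment replaces the first without any error, which is the kind of behaviour that could be difficult to debug.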
Letter case and numerals
Some conventions do not restrict letter case, but attach a well-defined interpretation based on letter case. Some naming conventions specify whether alphabetic, numeric, or alphanumeric characters may be used, and if so, in what sequence.
Multiple-word identifiers
A common recommendation is "Use meaningful identifiers." A single word may not be as meaningful, or specific, as multiple words. Consequently, some naming conventions specify rules for the treatment of "compound" identifiers containing more than one word.
As most programming languages do not allow whitespace in identifiers, a method of delimiting each word is needed. Historically some early languages, notably FORTRAN and ALGOL, allowed spaces within identifiers, determining the end of identifiers by context. This was abandoned in later languages due to the difficulty of tokenization. It is possible to write names by simply concatenating words, and this is sometimes used, as in mypackage for Java package names, though legibility suffers for longer terms, so usually some form of separation is used.
Delimiter-separated words
One approach is to delimit separate words with a non-alphanumeric character. The two characters commonly used for this purpose are the hyphen ("-") and the underscore ("_"); e.g., the two-word name "two words" would be represented as "two-words" or "two_words".
The hyphen is used by nearly all programmers writing COBOL, Forth, and Lisp; it is also common in Unix for commands and packages, and is used in CSS. This convention has no standard name, though it may be referred to as lisp-case, COBOL-CASE, kebab-case, brochette-case, or other variants. Of these, kebab-case, dating at least to 2012, has achieved some currency since.
By contrast, languages in the FORTRAN/ALGOL tradition, notably languages in the C and Pascal families, used the hyphen for the subtraction infix operator, and did not wish to require spaces around it, preventing its use in identifiers.
An alternative is to use underscores; this is common in the C family, with lowercase words, being found for example in The C Programming Language, and has come to be known as snake case. Underscores with uppercase, as in UPPER_CASE, are commonly used for C preprocessor macros, hence known as MACRO_CASE, and for environment variables in Unix, such as BASH_VERSION in bash. Sometimes this is humorously referred to as SCREAMING_SNAKE_CASE.
Letter case-separated words
Another approach is to indicate word boundaries using medial capitalization, called "camelCase", "Pascal case", and many other names, thus respectively rendering "two words" as "twoWords" or "TwoWords". This convention is commonly used in Pascal, Java, C#, and Visual Basic. Treatment of initialisms in identifiers varies. Some conventions dictate that they be lowercased to ease typing and readability, whereas others leave them uppercased for accuracy.
Metadata and hybrid conventions
Some naming conventions represent rules or requirements that go beyond the requirements of a specific project or problem domain, and instead reflect a greater overarching set of principles defined by the software architecture, underlying programming language or other kind of cross-project methodology.
Hungarian notation
Perhaps the best known is Hungarian notation, which encodes either the purpose or the type of a variable in its name. For example, the prefix "sz" for the variable szName indicates that the variable is a null-terminated string.
Positional notation
A style used for very short names could be: LCCIIL01, where LC would be the application, C for COBOL, IIL for the particular process subset, and the 01 a sequence number.
This sort of convention is still in active use in mainframes dependent upon JCL and is also seen in the 8.3 MS-DOS style.
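As a rough illustration, a positional name such as LCCIIL01 can be decoded purely by character position. The Python sketch below assumes the field widths implied by that single example (2, 1, 3 and 2 characters); the names `PositionalName` and `parse_positional_name` are hypothetical, and a real installation would define its own layout.

```python
from typing import NamedTuple


class PositionalName(NamedTuple):
    application: str   # e.g. "LC"
    language: str      # e.g. "C" for COBOL
    process: str       # e.g. "IIL", the process subset
    sequence: int      # e.g. 1, from the trailing "01"


def parse_positional_name(name: str) -> PositionalName:
    """Split an 8-character positional name into its assumed fields."""
    if len(name) != 8:
        raise ValueError(f"expected 8 characters, got {len(name)}: {name!r}")
    return PositionalName(
        application=name[0:2],
        language=name[2],
        process=name[3:6],
        sequence=int(name[6:8]),  # raises ValueError if not numeric
    )


print(parse_positional_name("LCCIIL01"))
```

Nothing in the name itself marks where one field ends and the next begins, so the layout has to be known in advance by everyone who reads or generates such names.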
Composite word scheme (OF Language)
IBM's "OF Language" was documented in an IMS manual. It detailed the PRIME-MODIFIER-CLASS word scheme, which consisted of names like "CUST-ACT-NO" to indicate "customer account number".
PRIME words were meant to indicate major "entities" of interest to a system.
MODIFIER words were used for additional refinement, qualification and readability.
CLASS words ideally would be a very short list of data types relevant to a particular application. Common CLASS words might be: NO, ID, TXT, AMT, QTY, FL, CD, W and so forth. In practice, the available CLASS words would be a list of fewer than two dozen terms.
CLASS words, typically positioned on the right, served much the same purpose as Hungarian notation prefixes.
The purpose of CLASS words, in addition to consistency, was to specify to the programmer the data type of a particular data field. Prior to the acceptance of BOOLEAN fields, FL would indicate a field with only two possible values.
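The PRIME-MODIFIER-CLASS scheme can be sketched in Python as follows. This is an illustration under stated assumptions rather than IBM's definition: it assumes the first word is the PRIME word, the last is the CLASS word, and anything in between is a MODIFIER. The function name `split_composite_name` is hypothetical, and the set of CLASS words is simply the list given above.

```python
# CLASS words taken from the examples listed in the text above.
CLASS_WORDS = {"NO", "ID", "TXT", "AMT", "QTY", "FL", "CD", "W"}


def split_composite_name(name: str) -> dict:
    """Split a hyphenated name into PRIME, MODIFIER and CLASS words."""
    words = name.split("-")
    if len(words) < 2:
        raise ValueError(f"need at least a PRIME and a CLASS word: {name!r}")
    prime, *modifiers, class_word = words
    if class_word not in CLASS_WORDS:
        raise ValueError(f"unknown CLASS word {class_word!r} in {name!r}")
    return {"prime": prime, "modifiers": modifiers, "class": class_word}


print(split_composite_name("CUST-ACT-NO"))
```

Checking the rightmost word against a short, fixed list is what lets such a scheme enforce consistency: a name whose final word is not a recognised CLASS word is rejected.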
Language-specific conventions
ActionScript
Adobe's Coding Conventions and Best Practices suggests naming standards for ActionScript that are mostly consistent with those of ECMAScript. The style of identifiers is similar to that of Java.
Ada
In Ada, the only recommended style of identifiers is Mixed_Case_With_Underscores.
APL
In APL dialects, the delta (Δ) is used between words, e.g. PERFΔSQUARE. If the name used underscored letters, then the delta underbar (⍙) would be used instead.
C and C++
In C and C++, keywords and standard library identifiers are mostly lowercase. In the C standard library, abbreviated names are the most common, while the C++ standard library often uses an underscore as a word separator. Identifiers representing macros are, by convention, written using only uppercase letters and underscores. Names containing a double underscore or beginning with an underscore and a capital letter are reserved for the implementation and should not be used. This is superficially similar to stropping, but the semantics differ: the underscores are part of the value of the identifier, rather than being quoting characters: the value of __foo is __foo, not foo.
C#
C# naming conventions generally follow the guidelines published by Microsoft for all .NET languages, but no conventions are enforced by the C# compiler.
The Microsoft guidelines recommend the exclusive use of PascalCase and camelCase, with the latter used only for method parameter names and method-local variable names. A special exception to PascalCase is made for two-letter acronyms that begin an identifier; in these cases, both letters are capitalized. This is not the case for longer acronyms. The guidelines further recommend that the name given to an interface be PascalCase preceded by the capital letter I, as in IEnumerable.
The Microsoft guidelines for naming fields are specific to static, public, and protected fields; fields that are not static and that have other accessibility levels are explicitly not covered by the guidelines. The most common practice is to use PascalCase for the names of all fields, except for those which are private, which are given names that use camelCase preceded by a single underscore; for example, _totalCount.
Any identifier name may be prefixed by the commercial-at symbol (@), without any change in meaning. That is, both factor and @factor refer to the same object. By convention, this prefix is only used when the identifier would otherwise be either a reserved keyword, which may not be used as an identifier without the prefix, or a contextual keyword, in which case the prefix is not strictly required.
Go
In Go, the convention is to use MixedCaps or mixedCaps rather than underscores to write multiword names. When referring to types or functions, the first letter specifies the visibility for external packages. Making the first letter uppercase exports that piece of code, while lowercase makes it usable only within the current package.
Java
In Java, naming conventions for identifiers have been established and suggested by various Java communities such as Sun Microsystems, Netscape, AmbySoft, etc. A sample of the naming conventions set by Sun Microsystems is listed below, where a name in "CamelCase" is one composed of a number of words joined without spaces, with each word's initial letter in capitals, for example "CamelCase".
Identifier type | Rules for naming | Examples |
Classes | Class names should be nouns in UpperCamelCase, with the first letter of every word capitalised. Use whole words; avoid acronyms and abbreviations. | |
Methods | Methods should be verbs in lowerCamelCase or a multi-word name that begins with a verb in lowercase; that is, with the first letter lowercase and the first letters of subsequent words in uppercase. | run; runFast; getBackground; |
Variables | Local variables, instance variables, and class variables are also written in lowerCamelCase. Variable names should not start with underscore or dollar sign characters, even though both are allowed. This is in contrast to other coding conventions that state that underscores should be used to prefix all instance variables. Variable names should be short yet meaningful. The choice of a variable name should be mnemonic, that is, designed to indicate to the casual observer the intent of its use. One-character variable names should be avoided except for temporary "throwaway" variables. Common names for temporary variables are i, j, k, m, and n for integers; c, d, and e for characters. | |
Constants | Constants should be written in uppercase characters separated by underscores. Constant names may also contain digits if appropriate, but not as the first character. | static final int MAX_PARTICIPANTS = 10; |
widget.expand and Widget.expand imply significantly different behaviours: widget.expand implies an invocation of the method expand in an instance named widget, whereas Widget.expand implies an invocation of the static method expand in class Widget.
One widely used Java coding style dictates that UpperCamelCase be used for classes and lowerCamelCase be used for instances and methods.
Recognising this usage, some IDEs, such as Eclipse, implement shortcuts based on CamelCase. For instance, in Eclipse's content assist feature, typing just the upper-case letters of a CamelCase word will suggest any matching class or method name.
Under this style, initialisms of three or more letters are written in CamelCase instead of all uppercase. One may also set the boundary at two or more letters.
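The initialism rule above can be sketched as a small Python routine that builds CamelCase names from a list of words. This is an illustration, not part of any Java tool: the function names and the `initialism_limit` parameter are hypothetical, and the sketch assumes that a word supplied entirely in uppercase is an initialism.

```python
def capitalize_word(word: str, initialism_limit: int = 3) -> str:
    """Capitalise one word; initialisms shorter than the limit stay uppercase."""
    if word.isupper() and len(word) < initialism_limit:
        return word
    return word[:1].upper() + word[1:].lower()


def upper_camel_case(words, initialism_limit: int = 3) -> str:
    """Join words as UpperCamelCase (the style used for class names)."""
    return "".join(capitalize_word(w, initialism_limit) for w in words)


def lower_camel_case(words, initialism_limit: int = 3) -> str:
    """Join words as lowerCamelCase (the style used for methods and variables)."""
    first, *rest = words
    return first.lower() + upper_camel_case(rest, initialism_limit)


words = ["parse", "DBM", "XML", "from", "IP", "address"]
print(lower_camel_case(words))                      # boundary at three letters
print(lower_camel_case(words, initialism_limit=2))  # boundary at two letters
```

With the boundary at three letters, the two-letter initialism "IP" stays uppercase while "DBM" and "XML" become "Dbm" and "Xml"; lowering the boundary to two letters turns "IP" into "Ip" as well.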
JavaScript
The built-in JavaScript libraries use the same naming conventions as Java. Data types and constructor functions use upper camel case and methods use lower camel case. For consistency, most JavaScript developers follow these conventions.
Lisp
Common practice in most Lisp dialects is to use dashes to separate words in identifiers, as in with-open-file and make-hash-table. Dynamic variable names conventionally start and end with asterisks: *map-walls*. Constant names are marked by plus signs: +map-size+.
.NET
Microsoft .NET recommends UpperCamelCase, also known as PascalCase, for most identifiers; this is a shared convention for the .NET languages. Microsoft further recommends that no type prefix hints be used. Instead of using Hungarian notation, it is recommended to end the name with the base class's name: LoginButton instead of BtnLogin.
Objective-C
Objective-C has a common coding style that has its roots in Smalltalk.
Top-level entities, including classes, protocols, categories, as well as C constructs that are used in Objective-C programs like global variables and functions, are in UpperCamelCase with a short all-uppercase prefix denoting namespace, like NSString, UIAppDelegate, NSApp or CGRectMake. Constants may optionally be prefixed with a lowercase letter "k" like kCFBooleanTrue.
Instance variables of an object use lowerCamelCase prefixed with an underscore, like _delegate and _tableView.
Method names use multiple lowerCamelCase parts separated by colons that delimit arguments, like: application:didFinishLaunchingWithOptions:, stringWithFormat: and isRunning.
Pascal, Modula-2 and Oberon
Wirthian languages Pascal, Modula-2 and Oberon generally use Capitalized or UpperCamelCase identifiers for programs, modules, constants, types and procedures, and lowercase or lowerCamelCase identifiers for math constants, variables, formal parameters and functions. While some dialects support underscore and dollar signs in identifiers, snake case and macro case are more likely confined to use within foreign API interfaces.
Perl
Perl takes some cues from its C heritage for conventions. Locally scoped variables and subroutine names are lowercase with infix underscores. Subroutines and variables meant to be treated as private are prefixed with an underscore. Package variables are title cased. Declared constants are all caps. Package names are camel case, except for pragmata (e.g., strict and mro), which are lowercase.
PHP
PHP recommendations are contained in PSR-1 and PSR-12. According to PSR-1, class names should be in PascalCase, class constants should be in MACRO_CASE, and method names should be in camelCase.
Python and Ruby
Python and Ruby both recommend UpperCamelCase for class names, CAPITALIZED_WITH_UNDERSCORES for constants, and lowercase_separated_by_underscores for other names.
In Python, if a name is intended to be "private", it is prefixed by an underscore. Private variables are enforced in Python only by convention. Names can also be suffixed with an underscore to prevent conflict with Python keywords. Prefixing with double underscores changes behaviour in classes with regard to name mangling. Prefixing and suffixing with double underscores are reserved for "magic names" which fulfill special behaviour in Python objects.
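The Python conventions described above can be seen together in one short sketch. The class `RetryPolicy`, its attributes and the constant `MAX_RETRIES` are invented purely for illustration.

```python
MAX_RETRIES = 3                          # constant: CAPITALIZED_WITH_UNDERSCORES


class RetryPolicy:                       # class name: UpperCamelCase
    def __init__(self, max_attempts=MAX_RETRIES):
        self.max_attempts = max_attempts  # ordinary name: lowercase with underscores
        self._attempts_made = 0           # leading underscore: "private" by convention only
        self.__history = []               # double underscore: mangled to _RetryPolicy__history
        self.class_ = "default"           # trailing underscore avoids the keyword `class`

    def record_attempt(self, outcome):
        self._attempts_made += 1
        self.__history.append(outcome)

    def __len__(self):                    # "magic name": called by the built-in len()
        return len(self.__history)


policy = RetryPolicy()
policy.record_attempt("timeout")

print(policy._attempts_made)                     # still accessible from outside the class
print(hasattr(policy, "__history"))              # the unmangled name does not exist
print(hasattr(policy, "_RetryPolicy__history"))  # the mangled name does
print(len(policy))
```

The single underscore changes nothing at run time, whereas the double underscore causes the attribute to be stored under a class-qualified name, which mainly helps avoid accidental clashes in subclasses.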
R
While there is no official style guide for R, the tidyverse style guide from Hadley Wickham is widely followed. This guide recommends avoiding special characters in file names, e.g. fit_models.R, and using only numbers, letters and underscores for variable and function names.
Raku
Raku follows more or less the same conventions as Perl, except that it allows an infix hyphen (-) or an apostrophe (') within an identifier, provided that it is followed by an alphabetic character. Raku programmers thus often use kebab case in their identifiers; for example, fish-food and don't-do-that are valid identifiers.
Rust
Rust recommends UpperCamelCase for type aliases and struct, trait, enum, and enum variant names, SCREAMING_SNAKE_CASE for constants or statics, and snake_case for variable, function and struct member names.
Swift
Swift has shifted its naming conventions with each individual release. However, a major update with Swift 3.0 stabilised the naming conventions on lowerCamelCase across variables and function declarations. Constants are usually defined by enum types or constant parameters that are also written this way. Class and other object type declarations are UpperCamelCase.
As of Swift 3.0, the language has clear naming guidelines, made in an effort to standardise the API naming and declaration conventions across all third-party APIs.