This file documents conventions for names, code formatting and design decisions that should be used throughout the source code of PSCF:
File names and locations
- Names of header files end with the file name suffix *.h. Header files may be included into other files with few restrictions.
- Names of template implementation files use the file name suffix *.tpp. These are files that contain definitions of function templates (either free function templates or member functions of class templates) that are declared in an associated header file.
- Names of standard C++ source files that are compiled by the build system end with suffix *.cpp.
- Names of CUDA C++ source files that are compiled by the build system end with suffix *.cu.
- One class per file rule: Avoid placing declarations or definitions involving more than one public class or class template in a single file.
- The base name for all files associated with a single class or class template should be given by the name of the class or class template. The header and source files for a class named MyClass should thus be MyClass.h and MyClass.cpp. The name of a template implementation file associated with a class template named MyClass should be MyClass.tpp.
- Place all C++ and CUDA C++ files associated with a single class or class template in the same directory, within the src/ directory tree.
Symbol names and capitalization
- Names of functions and variables are lower case camel, as in "myVariable" or "myData".
- Names of all user-defined types (class, typedef, and enum types) and namespaces are upper case camel, as in "MyClass" or "Util".
- Names of private or protected class member variables must end with a single trailing underscore, like "data_" or "outputFile_".
- Public static constant class member variable names are upper case camel, as in "MaxDimension".
- Use plural nouns for names of arrays and other containers. A private member variable that represents an array in which each element is an object of type Monomer might thus be named "monomers_".
- Names of pointer variables end with a suffix Ptr. A local Thing* pointer variable within a function might thus be called "thingPtr". A Thing* pointer that is a private or protected class member variable might be called "thingPtr_". Do not use this convention for pointers that point at C arrays (and avoid bare C arrays): The suffix "Ptr" denotes a pointer that will point at a single object.
- Names of C/C++ preprocessor macros are upper case, with underscores between words. Full upper case names should be avoided for other symbols.
- Use the same parameter names in the declaration of a function and in the corresponding definition.
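The naming rules above can be illustrated with a minimal sketch. All names used here (Sketch, Monomer, Mixture, MaxMonomer, SKETCH_MAX_MONOMER) are hypothetical and chosen only to show the conventions. They are not taken from the PSCF source, and std::vector is used only to keep the sketch self-contained.

```cpp
#include <cassert>
#include <vector>

// Preprocessor macro: upper case, underscores between words.
#define SKETCH_MAX_MONOMER 8

// Namespace name: upper case camel (hypothetical namespace).
namespace Sketch {

   // Type name: upper case camel.
   class Monomer
   {
   public:

      // Function and parameter names: lower case camel.
      void setKuhn(double kuhn)
      { kuhn_ = kuhn; }

      double kuhn() const
      { return kuhn_; }

   private:

      // Private member variable: single trailing underscore.
      double kuhn_ = 0.0;

   };

   class Mixture
   {
   public:

      // Public static constant: upper case camel.
      static const int MaxMonomer = SKETCH_MAX_MONOMER;

      void addMonomer(Monomer const & monomer)
      { monomers_.push_back(monomer); }

      // Stores the address of a single object in a "Ptr" member.
      void setSolvent(Monomer& solvent)
      { solventPtr_ = &solvent; }

      Monomer& solvent()
      { return *solventPtr_; }

      int nMonomer() const
      { return static_cast<int>(monomers_.size()); }

   private:

      // Container name: plural noun.
      std::vector<Monomer> monomers_;

      // Pointer to a single object: suffix "Ptr".
      Monomer* solventPtr_ = nullptr;

   };

}

// Usage of the hypothetical classes above.
inline void namingExample()
{
   Sketch::Monomer monomer;
   monomer.setKuhn(1.5);
   Sketch::Mixture mixture;
   mixture.addMonomer(monomer);
   mixture.setSolvent(monomer);
   Sketch::Monomer* monomerPtr = &mixture.solvent();
   assert(monomerPtr == &monomer);
   assert(monomerPtr->kuhn() == 1.5);
   assert(mixture.nMonomer() == 1);
}
```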
Header guards
Code format standards
Indentation and line wraps
- Indent exactly 3 spaces per level. Use only spaces for indentation: Do NOT ever use tabs for indentation.
- Break lines at less than approximately 76 characters per line whenever possible, to preserve readability in printouts and on relatively small screens. Much of the code has been written using the vi editor with line numbers turned on with the command "set nu" on a screen with a total of 80 columns, and should fit within the remaining columns without line wraps when displayed in this way.
Namespace declarations
- All source code in the PSCF repository must be enclosed within the Pscf or Util namespace, or enclosed sub-namespaces of these.
- All source code must thus be enclosed in namespace declaration blocks. All or almost all of the code in one file is usually enclosed in a single namespace. Start this namespace declaration in the first column.
- To save horizontal space, namespace declarations for two or more nested namespaces that enclose most or all of the code in a file may each start in the first column. A file that contains code that is declared in namespace Pscf::Prdc may thus be enclosed in namespace declarations that are formatted as shown below:
namespace Pscf {
namespace Prdc {

   // Omitted code for entities declared in Pscf::Prdc

}
}
Class definitions
All guidelines given here apply to definitions of both classes and class templates:
- Begin definitions of public classes one indentation level (3 spaces) inside the enclosing namespace declaration.
- In class definitions, align "public:", "protected:", and "private:" declarations with the beginning of the class definition statement, and with the closing brace. List public members first, then protected, then private.
- Within the private block, list member variables first, then any private member functions.
- Within a protected block, list member functions first, then member variables. This gives a contiguous block of member function declarations spanning the public and protected blocks, and a contiguous block of member variables spanning the protected and private blocks.
- Do not define public member variables.
- List any friend declarations at the end of a class definition in a "pseudo-block" that is preceded by a comment "//friends:" on a line by itself, after the private members. The "//friends:" comment should be aligned with "public:" and "private:" declarations.
- Inline method definitions should usually be given outside the class definition, within the header file. The word "inline" should be added to this function definition, but not to the function declaration.
- Example (with doxygen documentation):
#ifndef UTIL_SILLY_CLASS_H
#define UTIL_SILLY_CLASS_H

namespace Util
{

   /**
   * A truly pointless class.
   */
   class SillyClass : public Base
   {

   public:

      /**
      * The first method.
      *
      * \param param1 a globble
      * \param param2 a gloob
      */
      int method1(int param1, double param2);

      /**
      * Get buddy (by const reference).
      */
      const Buddy& buddy() const;

   protected:

      /**
      * Get buddy (by non-const reference).
      */
      Buddy& buddy();

   private:

      int data1_;
      Buddy* buddyPtr_;

   //friends:

      friend class Buddy;

   };

   // Inline methods

   inline const Buddy& SillyClass::buddy() const
   { return *buddyPtr_; }

   inline Buddy& SillyClass::buddy()
   { return *buddyPtr_; }

}
#endif
Function definitions
These guidelines apply to both class member functions and free functions:
- Place one space after each comma in each function parameter list. Do not place any additional space immediately after the opening parentheses or before the closing parenthesis in a function parameter list.
- For functions with more than one line, put the opening brace on a separate line, and align opening and closing braces, like this:
int SillyClass::sillyMethod(int param1, int max)
{
   for (int i = 0; i < max; ++i) {
      param1++;
   }
   if (param1 > 0) {
      return 0;
   } else {
      return 1;
   }
}
- For one-line functions, the function definition may be given on a single line, like this:
inline int SillyClass::data()
{ return data_; }
- For control structures (if, for, while, etc.), place the opening brace at the end of a line, and the closing brace on a line by itself, aligned with the beginning of the opening line, like this:
for (int i = 0; i < end; ++i) {
   doSomething();
}
- Usually set off the operators =, ==, <, >, +, and - by one space on either side, with occasional exceptions to avoid line wraps. Multiplication (*) and division (/) operators may or may not be set off by white space.
- Use one space before the opening parenthesis of a conditional statement in a control structure (if, for, while, etc.) and one space after the closing parenthesis and before any opening brace.
- Do not follow an opening parenthesis or precede a closing parenthesis by a space in parentheses that enclose a function parameter list or the condition for a control structure. Do not add whitespace before commas or semicolons.
- Consecutive function declarations or definitions within a file, along with associated documentation blocks, should be separated by a single blank line.
Documentation (via doxygen)
PSCF uses the doxygen utility (www.doxygen.org) to create html documentation from documentation blocks that are embedded within the source code. Doxygen scans the source code and extracts comments that are marked for extraction. Specifically, doxygen extracts multi-line comments that begin with a slash and two asterisks ("/**") and single-line comments that begin with three slashes ("///"). See comments in the following example:
/**
* This comment will be extracted by doxygen and included in the
* web manual (note the extra asterisk)
*/
/// So is this one (note the extra slash)
/*
* This comment, however, will be ignored by doxygen.
*/
// This will also be ignored.
Comments within functions that you do not wish to be extracted by doxygen should use the usual form for C comments, using only a single asterisk for multi-line comments or only two slashes for single-line comments, as indicated in the above example.
Guidelines:
- Create doxygen documentation blocks for public and protected named quantities, i.e., all classes, public and protected member functions, protected member variables, namespaces, global functions, typedefs, enums, and constants.
- The doxygen documentation block for a class should appear immediately above the first line of the class definition.
- The doxygen documentation block for a class member function should be immediately above the first line of the function declaration. In the case of a class member function, this will appear within the class definition, in the associated header file.
- Prefer doxygen multiline comments for documentation of classes and class member functions.
- Document all parameters of public and protected functions, using the doxygen param keyword.
- The documentation for every class and public or protected function should begin with a brief single-sentence description, which must end with a period and be followed by a blank line. The brief description should usually not extend beyond one line. This brief description is often sufficient. If needed, more detailed discussion may be given in one or more subsequent paragraphs, separated by blank lines.
Example:
/**
* Align the universe.
*
* This is a longer discussion of what the method does, and of how and
* why it does it. It may also contain a discussion of assumptions, and
* of the algorithm.
*
* A longer discussion may contain two or more paragraphs, separated by
* blank lines.
*
* \param thing1 value of some quantity
* \param thing2 flag to determine what to do with thing1
* \return shift in the position of the universe
*/
double alignUniverse(double thing1, bool thing2);
- Private class member functions and member variables should also be documented, but the use of multi-line or single-line doxygen form is optional for these class members. Documentation of private members does not normally appear in the html documentation, but formatted documentation blocks can be included if desired by changing a setting in the configuration file for doxygen.
- Do not use doxygen-style comments for the function definitions that are given outside the class definition. Instead, use a conventional C/C++ comment format (with one asterisk) above each such function definition, like this:
/*
* Brief description of the function.
*/
void MyClass::myFunction(int param)
{
}
Header file includes
- Files that are defined in the PSCF src directory tree may be included by giving the path relative to the src directory within angle brackets. For example, to include the header file src/util/param/ParamComposite.h, one could use an include directive
#include <util/param/ParamComposite.h>
This is possible because the PSCF build system adds the PSCF src directory to the list of directories that the preprocessor should search for included files.
- In header files, prefer use of forward declarations over header file inclusion. Forward declarations are sufficient for classes or class templates that are used in a header file only in function declarations (as a function parameter or return type) or in the declaration of types for pointer member variables.
- Inclusion of header files is generally necessary in header files only for headers that define base classes or types for member objects, and for classes that are used in the implementation of inline functions.
- Use explanatory comments to document reasons for the use of a header file include rather than a forward declaration within a header file. Put a brief single-line explanatory comment at the end of each line that begins with a header file include directive within a header file. Such comments might say something like "base class", "member" or "inline function" to document the reason that inclusion of the associated header file is necessary. This practice helps force the developer to think about whether a header file include could be replaced by a forward declaration, and documents their reasoning for others.
- Header files that are needed in definitions of non-inline class member functions but that are not needed in the class definition or inline member functions should usually be included in the file that contains definitions of the member functions, whenever this is separate from the header file, rather than in the class or class template header file. There is no need to document reasons for including header files in such files.
- Header files that provide information that is required in a C++ file should usually be included explicitly, even if it is known that they would be indirectly included via another included header file. Reliance on indirect inclusion is fragile, and explicit inclusion helps document the dependency. Exceptions include:
   - A compilable source file MyClass.cpp or class template implementation file MyClass.tpp may rely on indirect inclusion of header files that are included by the corresponding header file MyClass.h.
   - A derived class may sometimes rely on indirect inclusion of header files that are included by a base class (though there is no harm in instead explicitly including all required headers).
   - The util/misc/Exception.h header should generally be included indirectly by inclusion of src/util/global.h.
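The preference for forward declarations can be sketched in a single file as shown below. The classes Holder and Buddy are hypothetical, and the comments mark which parts would normally live in a header file and which in a separate source file. The trailing comments on the include lines follow the "reason for inclusion" practice described above.

```cpp
#include <cassert>
#include <string>    // member

// --- Contents that would appear in a header file Holder.h ---

// Forward declaration: Buddy is used below only as a pointer member
// and as a function parameter or return type.
class Buddy;

class Holder
{
public:

   void setBuddy(Buddy& buddy);
   Buddy& buddy();

   std::string const & name() const
   { return name_; }

private:

   // Member object: requires the full definition of std::string.
   std::string name_ = "holder";

   // Pointer member: a forward declaration of Buddy is sufficient.
   Buddy* buddyPtr_ = nullptr;

};

// --- Contents that would appear in Holder.cpp, which would include
// --- the header that defines Buddy (defined inline here instead).

class Buddy
{
public:

   int id() const
   { return 7; }

};

void Holder::setBuddy(Buddy& buddy)
{ buddyPtr_ = &buddy; }

Buddy& Holder::buddy()
{ return *buddyPtr_; }

// Usage of the hypothetical classes above.
inline void forwardDeclarationExample()
{
   Buddy buddy;
   Holder holder;
   holder.setBuddy(buddy);
   assert(&holder.buddy() == &buddy);
   assert(holder.buddy().id() == 7);
   assert(holder.name() == "holder");
}
```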
Data structures
- Use the C++ iostream classes for all file IO. For consistency, avoid the C fscanf() and fprintf() functions. Use the wrapper classes in the src/util/format directory as needed to simplify coding of formatted output in fields of specified width to ostreams.
- Use std::string to represent character strings whenever possible.
- Prefer the array container templates defined in the src/util/containers directory over bare C arrays and STL containers. The most commonly used container, named DArray, is a dynamically allocated array in which space for all elements is allocated at run time by calling a member function named "allocate". This preference for the home-brewed containers is based primarily on two considerations:
   - Our home-brewed containers provide optional bounds checking, which is enabled only when UTIL_DEBUG is defined. The standard library containers do not provide an equally convenient way to globally turn bounds checking on to facilitate debugging during development and turn it off to maximize performance in production code.
   - The most heavily used array containers treat allocation as a single event that can be performed once. In PSCF, most arrays are allocated during processing of the parameter file, before any computation is performed. This is the most convenient interface for memory management for scientific computations that involve large arrays that retain a fixed size after initialization, but this behavior is not provided by any of the C++ standard containers.
- C++ STL containers such as std::vector should be used only when there is not an equivalent home-brewed container, or when use of a standard library container simplifies access to standard library algorithms (e.g., to the sorting algorithm).
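The two design considerations above (allocation as a one-time event, and bounds checking that can be compiled in or out) can be sketched with a small class template. This is not the PSCF DArray implementation: the class SketchArray and the macro SKETCH_ARRAY_DEBUG are hypothetical stand-ins, and the error handling uses standard library exceptions only to keep the sketch self-contained.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for UTIL_DEBUG. Remove this definition to
// compile the sketch without bounds checking.
#define SKETCH_ARRAY_DEBUG

template <typename T>
class SketchArray
{
public:

   SketchArray() = default;

   ~SketchArray()
   { delete [] data_; }

   // Copying is disabled to keep the sketch short.
   SketchArray(SketchArray const &) = delete;
   SketchArray& operator = (SketchArray const &) = delete;

   // Allocate memory exactly once, e.g., after reading parameters.
   void allocate(int capacity)
   {
      if (data_) {
         throw std::logic_error("Array is already allocated");
      }
      if (capacity <= 0) {
         throw std::invalid_argument("Capacity must be positive");
      }
      data_ = new T[capacity]();
      capacity_ = capacity;
   }

   // Element access, with bounds checking only in debug builds.
   T& operator [] (int i)
   {
      #ifdef SKETCH_ARRAY_DEBUG
      if (i < 0 || i >= capacity_) {
         throw std::out_of_range("Array index out of range");
      }
      #endif
      return data_[i];
   }

   int capacity() const
   { return capacity_; }

private:

   // Pointer to a C array (hence no "Ptr" suffix).
   T* data_ = nullptr;

   int capacity_ = 0;

};

// Usage of the hypothetical container above.
inline void arrayExample()
{
   SketchArray<double> values;
   values.allocate(3);
   values[2] = 4.0;
   assert(values.capacity() == 3);
   assert(values[2] == 4.0);

   bool caught = false;
   try {
      values.allocate(5);   // second allocation is an error
   } catch (std::logic_error const &) {
      caught = true;
   }
   assert(caught);
}
```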
File I/O
- Use the C++ iostream classes for all file IO.
- Do not use the C fscanf() and fprintf() functions.
- Use the I/O wrapper classes in the src/util/format directory (Util::Int, Util::Dbl, Util::Str, etc) as needed for formatted output to ostreams in fields of specified width.
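As a rough illustration of formatted output in fields of specified width using only iostream classes, the sketch below uses standard library manipulators. It does not use the PSCF wrapper classes, whose interface is not shown in this document; the function writeRow and the chosen field widths are assumptions made for the example.

```cpp
#include <cassert>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Write an integer in a field of width 5 and a floating point value
// in a field of width 12 with 4 digits after the decimal point.
// Note that std::fixed and std::setprecision persist on the stream.
inline void writeRow(std::ostream& out, int index, double value)
{
   out << std::setw(5) << index
       << std::setw(12) << std::fixed << std::setprecision(4) << value
       << "\n";
}

// Usage: write to a string stream and inspect the result.
inline void outputExample()
{
   std::ostringstream out;
   writeRow(out, 7, 3.14159);
   assert(out.str() == "    7      3.1416\n");
}
```

The same function can write to a std::ofstream or to std::cout, since both are ostreams.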
Use of C++ templates
- In code for periodic systems defined in namespaces Pscf::Prdc, Pscf::Rpc and Pscf::Rpg, the dimension D of space is always treated as a template parameter. As a result, most of the files in directories src/prdc, src/rpc, and src/rpg define class templates for which D is a template parameter.
- Specializations of class templates in which the dimension of space D is the only template parameter should be explicitly instantiated for the three possible valid values of D=1, 2, or 3. The header file for each such class should contain explicit instantiation declarations for these three specializations.
- Definitions of non-inlined member functions of class templates are often placed in a template implementation file with file name suffix .tpp. In the case of a standard template that is designed to allow implicit instantiation, this .tpp file is usually included by the associated header file, by an include statement placed near the end of the header file.
- See the pages about C++ templates for a detailed discussion of different usage patterns for class templates that are designed to be explicitly instantiated.
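The explicit instantiation pattern described above can be sketched in a single file. The class template Cell is hypothetical, and the comments indicate how the pieces might be divided among a header file, a .tpp file, and a .cpp file; the actual PSCF file layout is described in the pages about C++ templates.

```cpp
#include <cassert>

// --- Contents that would appear in a header file (e.g., Cell.h) ---

template <int D>
class Cell
{
public:

   // Number of vertices of a D-dimensional cell (2 to the power D).
   int nVertex() const;

};

// Explicit instantiation declarations for D = 1, 2, 3. These tell
// the compiler not to instantiate these specializations implicitly.
extern template class Cell<1>;
extern template class Cell<2>;
extern template class Cell<3>;

// --- Contents that would appear in Cell.tpp ---

template <int D>
int Cell<D>::nVertex() const
{
   int n = 1;
   for (int i = 0; i < D; ++i) {
      n *= 2;
   }
   return n;
}

// --- Contents that would appear in Cell.cpp ---

// Explicit instantiation definitions for D = 1, 2, 3.
template class Cell<1>;
template class Cell<2>;
template class Cell<3>;

// Usage of the hypothetical template above.
inline void templateExample()
{
   Cell<1> line;
   Cell<3> cube;
   assert(line.nVertex() == 2);
   assert(cube.nVertex() == 8);
}
```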
Error handling
- Use the UTIL_CHECK(...), UTIL_ASSERT(...), or UTIL_THROW(...) preprocessor function macros for error tests. If an error is encountered, these macros all print an error statement to a Log file that displays the file and line number location of the error macro, and then throw a Util::Exception. Exceptions thrown by these macros are generally not caught elsewhere, and thus cause execution to halt.
- All files that use these macros for error tests must include the header file "util/global.h". This file defines the UTIL_CHECK, UTIL_ASSERT, and UTIL_THROW macros, includes the "util/misc/Exception.h" and "util/misc/Log.h" header files that are used by these macros, and enables execution of optional debugging tests if the UTIL_DEBUG preprocessor macro is defined on entry to this file.
- The "setopts" script may be used to define or undefine the UTIL_DEBUG macro, and thus enable or disable conditional compilation of optional "debugging" tests. The command "./setopts -d0" comments out a line in a config.mk configuration file that defines UTIL_DEBUG, thereby disabling optional debugging tests, while "./setopts -d1" uncomments (restores) that line, and thus enables these optional tests.
- Do not use exceptions for control flow. Exceptions should be used only for unrecoverable errors, and should cause the program to terminate after printing an error message.
- Prefer use of UTIL_CHECK for most mandatory tests and UTIL_ASSERT for optional debugging tests. Reserve use of UTIL_THROW for rare cases for which a more detailed error message is required.
- The UTIL_CHECK(condition) macro takes a logical (bool) expression as an argument. If the expression is false, it prints an error message to the Log file and throws a Util::Exception object. The error message includes the text of the logical condition that failed, as well as a file name and line number of the UTIL_CHECK macro. This macro is always enabled, independent of the value of the UTIL_DEBUG macro, and so is used for mandatory tests.
- The UTIL_ASSERT(condition) macro is identical to UTIL_CHECK, except that this macro is enabled if and only if the UTIL_DEBUG macro is defined (i.e., only if optional debugging tests are enabled). The purpose and behavior are similar to those of the built-in C "assert" macro.
- The UTIL_THROW(message) macro takes a character string containing an error message as an argument, and always prints an error message and then throws an exception. The error message contains the text of the macro argument, along with file name and line number information that is added automatically. Because this macro does not test a logical condition, and always throws an Exception, it normally must be placed inside the body of a C/C++ if statement block that tests a logical error condition, so that it is executed only if an error is found. The UTIL_THROW macro may be preceded within the body of such an if statement block by statements that print out more detailed information about values of the variables and the nature of the error, to aid debugging. Like UTIL_CHECK, the UTIL_THROW macro is always enabled, whether or not the UTIL_DEBUG macro is defined. The UTIL_THROW macro can be used to output more detailed error messages than those produced by UTIL_CHECK, at the cost of more verbose code for error checking.
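The roles of the three macros can be illustrated with simplified stand-ins. The macros SKETCH_CHECK, SKETCH_ASSERT and SKETCH_THROW below are hypothetical and are not the PSCF definitions: they throw a std::runtime_error rather than a Util::Exception, and they do not write to a Log file. The macro SKETCH_DEBUG plays the role of UTIL_DEBUG.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Build a message with file and line information, then throw.
[[noreturn]] inline
void sketchThrow(char const * message, char const * file, int line)
{
   std::ostringstream out;
   out << message << " [" << file << ":" << line << "]";
   throw std::runtime_error(out.str());
}

// Always throws (stand-in for UTIL_THROW).
#define SKETCH_THROW(message) sketchThrow(message, __FILE__, __LINE__)

// Mandatory test, always enabled (stand-in for UTIL_CHECK).
#define SKETCH_CHECK(condition) \
   do { \
      if (!(condition)) { \
         SKETCH_THROW("Failed check: " #condition); \
      } \
   } while (false)

// Optional test, enabled only in debug builds (stand-in for UTIL_ASSERT).
#ifdef SKETCH_DEBUG
#define SKETCH_ASSERT(condition) SKETCH_CHECK(condition)
#else
#define SKETCH_ASSERT(condition) do { } while (false)
#endif

// Example function that uses all three macros.
inline int checkedIndex(int i, int capacity)
{
   SKETCH_CHECK(capacity > 0);
   SKETCH_ASSERT(capacity < 1000000);
   if (i < 0 || i >= capacity) {
      // More detailed diagnostics could be printed here.
      SKETCH_THROW("Index out of range");
   }
   return i;
}

// Usage: a failed test throws an exception that carries the message.
inline void errorExample()
{
   bool caught = false;
   try {
      checkedIndex(5, 3);
   } catch (std::runtime_error const & e) {
      std::string message(e.what());
      caught = (message.find("Index out of range") != std::string::npos);
   }
   assert(caught);
}
```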
Function name guidelines
- Names of functions that are not simple accessors (discussed below) should usually be verbs.
- Names of functions that compute something often start with the word "compute".
- The name of a simple accessor ("getter") class member function that returns a non-public member variable by value or by reference should simply be the name of the member variable, without any trailing underscore. This same name convention is used for accessors that return by value, const reference, or non-const reference. Thus, for example:
int Thing::data() const
{ return data_; }
const Vector& Thing::position() const
{ return position_; }
Vector& Thing::position()
{ return position_; }
Molecule& Thing::molecule()
{ return *moleculePtr_; }
- The name of a "setter" class member function that is passed a value for a non-public member variable should begin with the prefix "set". The name of the function parameter that holds the new value should be the same as the name of the non-public member variable, without any trailing underscore. The same convention is used whether the value is passed by value or by reference, and is also used for functions that store either the address of an object or a copy of its value. For example:
void Thing::setData(int data)
{ data_ = data; }
void Thing::setPosition(Vector const& position)
{ position_ = position; }
void Thing::setMolecule(Molecule &molecule)
{ moleculePtr_ = &molecule; }
- Class member functions that are used to create associations between objects, by storing the address of an associated object in a pointer member variable, may use names that start with the prefix "set".
Function interface guidelines
- Prefer passing and returning primitive C/C++ data types by value. Pass primitive data types to functions by non-const reference only when they must be modified within the function.
- Prefer passing objects (class instances) to functions by reference, not by value. Pass by const reference if the object is not modified.
- Prefer references over pointers as function parameters. Pass pointers to functions only if: i) a null value for the pointer is a meaningful possibility, or ii) the pointer contains an address that must be re-assigned within the function.
- Prefer references over pointers as function return values when returning a handle to a persistent object to provide access without delegating ownership (i.e., responsibility for destruction).
- Factory functions that create a new object and then relinquish ownership return a pointer to the newly created object, and return a nullptr if allocation fails.
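The parameter passing guidelines above can be summarized in a short sketch. The class Point and the free functions shown here are hypothetical and serve only to show one function signature for each case.

```cpp
#include <cassert>

class Point
{
public:

   Point(double x, double y)
    : x_(x), y_(y)
   {}

   double x() const
   { return x_; }

   double y() const
   { return y_; }

   void shift(double dx, double dy)
   { x_ += dx; y_ += dy; }

private:

   double x_;
   double y_;

};

// Object passed by const reference, primitive passed by value.
inline double scaledNorm2(Point const & point, double scale)
{ return scale*(point.x()*point.x() + point.y()*point.y()); }

// Object that is modified: passed by non-const reference.
inline void recenter(Point& point)
{ point.shift(-point.x(), -point.y()); }

// Primitive that is modified: passed by non-const reference.
inline void increment(int& counter)
{ ++counter; }

// Pointer parameter: a null value is meaningful here, and is taken
// to mean that distance is measured from the origin of coordinates.
inline double distance2(Point const & point, Point const * referencePtr)
{
   double x0 = 0.0;
   double y0 = 0.0;
   if (referencePtr) {
      x0 = referencePtr->x();
      y0 = referencePtr->y();
   }
   double dx = point.x() - x0;
   double dy = point.y() - y0;
   return dx*dx + dy*dy;
}

// Usage of the hypothetical functions above.
inline void interfaceExample()
{
   Point point(3.0, 4.0);
   Point reference(3.0, 0.0);
   assert(scaledNorm2(point, 2.0) == 50.0);
   assert(distance2(point, nullptr) == 25.0);
   assert(distance2(point, &reference) == 16.0);
   recenter(point);
   assert(point.x() == 0.0 && point.y() == 0.0);
}
```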
Class interface guidelines
- Make all nonstatic class member variables private or (less frequently) protected. Prefer private variables over protected whenever possible.
- Practice strict "const correctness". Mark function parameters and class member functions as const whenever possible.
- Class member functions should be marked const if they do not change the logical state of the object.
- If a member function does not modify the logical state of an object in a manner that is detectable from outside the class, but modifies a private member variable that is being used as, e.g., work space or an internal bookkeeping device, it may sometimes be appropriate to declare that private variable to be "mutable" in order to be able to declare the member function to be "const".
- Read-only access to a member variable of primitive C/C++ type should be provided (when needed) by an accessor function that returns the member variable by value. Read-only access to an object that is an instance of a class may be provided by an accessor function that returns the object by const reference.
- Simple accessor functions that provide read-only access to a member variable by value or by const reference should be declared as const functions.
- Read-write access to a member of a class may be provided by an accessor that returns the object by non-const reference. Such read-write accessors should not be marked const, since they provide access that allows external code to modify the state of a member variable.
- Example: If a class has an int member variable data_ and a member object_ that is an instance of class Object, you might consider providing any (or none) of the following member accessor functions:
int Thing::data() const
{ return data_; }
const Object& Thing::object() const
{ return object_; }
Object& Thing::object()
{ return object_; }
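The guideline above on "mutable" members can be illustrated with a cached result. The class Sample below is hypothetical: its sum() function is logically const, since the values visible from outside the class do not change, but it updates private bookkeeping variables that are therefore declared mutable.

```cpp
#include <cassert>
#include <vector>

class Sample
{
public:

   void addValue(double value)
   {
      values_.push_back(value);
      isCurrent_ = false;
   }

   // Logically const: recomputes the cached sum only if needed.
   double sum() const
   {
      if (!isCurrent_) {
         sum_ = 0.0;
         for (double value : values_) {
            sum_ += value;
         }
         isCurrent_ = true;
      }
      return sum_;
   }

private:

   std::vector<double> values_;

   // Internal bookkeeping, modified within a const function.
   mutable double sum_ = 0.0;
   mutable bool isCurrent_ = true;

};

// Usage: sum() may be called through a const reference.
inline void mutableExample()
{
   Sample sample;
   sample.addValue(1.0);
   sample.addValue(2.5);
   Sample const & constSample = sample;
   assert(constSample.sum() == 3.5);
}
```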
- Providing a non-const accessor function that returns a non-const reference to a member variable is equivalent to making the member public, and may be used when this is the desired behavior. Do not instead simply make a data member a public member.
- An accessor that returns a non-const reference to a member variable provides pseudo-public access to that member. The advantages of providing such a function when this is the desired behavior over simply making that data member public are:
   - Uniform interface: If members can only be accessed through accessor functions, users need not remember which members are public and which are accessed through accessor functions.
   - Uniform name conventions: The name of the accessor that returns a member is always the name of the member variable, without any underscore. This allows us to use an underscore to mark member variable names, without exposing underscored names outside the class implementation.
   - Uniform access: The same name convention is used for accessors that return by value, const reference, or reference, but the compiler can still enforce access control.
   - Implementation hiding: The same interface convention is used for a function that accesses a member variable of a class and one that accesses a dynamically allocated "pseudo-member" object through a pointer member variable. The choice of whether to use a member variable or a pointer is thus hidden from users of the class.
   - Error checking: Consistent use of accessor functions makes it convenient to add and remove mandatory or optional sanity checks as desired, such as a check that a pointer is not null before it is de-referenced.
- Write access to a member variable may sometimes instead be provided by an explicit "set" function, which is a void function for which the name begins with the prefix "set" and for which a value or object is passed as an argument. Such "set" functions and accessors that return non-const references thus provide two alternate ways of granting write access to a member variable.
- A void "set" function is often used in preference to an accessor that returns a non-const reference if the function has either of the following two purposes:
   - Setting a simple data value: Setting a value for a member variable that is an instance of a primitive C/C++ type, an enumeration, or some very simple class type.
   - Creating an association: In this case, an association with an object that is passed as an argument is created by storing the address of that object in a pointer member variable.
In these two cases, the use of the "set" prefix documents intent, making it clear that the purpose is to set a value or create an association by setting an address value.
- An accessor that returns a member variable by non-const reference is usually preferred over a void "set" function when the relevant member variable is an instance of a class that defines member functions, and when users are likely to use the accessor to invoke these member functions. The accessor syntax is convenient for this purpose because it allows calls to member functions to be chained. Use of the accessor syntax thus provides a convenient syntax for accessing members of objects that are nested within an ownership hierarchy, while allowing the same syntax to be used to invoke const and non-const member functions of nested objects.
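Chaining of accessor calls through an ownership hierarchy might look like the sketch below. The classes Experiment, Specimen and Lattice are hypothetical names invented for this example. Each owner provides a const and a non-const accessor with the same name, so the same chained syntax works for both read-only and read-write access.

```cpp
#include <cassert>

class Lattice
{
public:

   void setSize(int size)
   { size_ = size; }

   int size() const
   { return size_; }

private:

   int size_ = 0;

};

class Specimen
{
public:

   Lattice& lattice()
   { return lattice_; }

   Lattice const & lattice() const
   { return lattice_; }

private:

   Lattice lattice_;

};

class Experiment
{
public:

   Specimen& specimen()
   { return specimen_; }

   Specimen const & specimen() const
   { return specimen_; }

private:

   Specimen specimen_;

};

// Usage: chained accessor calls on non-const and const objects.
inline void chainingExample()
{
   Experiment experiment;
   experiment.specimen().lattice().setSize(32);

   Experiment const & constExperiment = experiment;
   assert(constExperiment.specimen().lattice().size() == 32);
}
```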