2.2 Compiling (Prev) 2.4 Parameter Files (Next)

This section describes how to execute the simpatico programs mcSim, mdSim and ddSim. The command line interface is similar for all three programs. Differences among the three programs are discussed explicitly below. This page gives instructions for using any of these three programs to simulate a single physical system. Multi-system simulations, such as those used for replica exchange algorithms, are discussed separately (see 2.11 Multi-System Simulations).

See also: Programs

Input files

Every simpatico simulation requires three types of input file:

a parameter file
a command file
a configuration file

The contents and formats of these different types of file are discussed briefly below and in more detail in several separate pages (see 2.4 Parameter Files , 2.5 Command Files and 2.6 Configuration Files). All three programs use similar formats for the parameter and command files. Each program has a default configuration file format, but can also read and write several other formats.

When a program simulation is executed, the parameter file is read first. The parameter file is used to initialize the state of the program and allocate memory. The name of the parameter file is specified when a program is invoked as a argument of a command line option.

The command file is read and interpreted after the parameter file. The command file is a script that contains a list of commands that are interpreted and executed in sequence. This script controls the program flow after initialization. The name of the command file is specified as an argument of a command line option.

The command file for a simple simulation would normally contain a command to read a specific input configuration file, a command to run a simulation of a specified number of steps, a command to write the final configuration to a specific file, and a command to halt execution. The name of the input configuration file must thus be contained within the command file.

Running a simulation

The names of the parameter and command files required to run a simpatico simulation are passed to the program as arguments of the "-p" and "-c" command line options, respectively.

To run a single-processor simulation, one thus invokes the mdSim or mcSim executable with the -p and -c options followed by appropriate input file names. For example, to run the default version of the single-processor mdSim MD simulation, using a parameter file named "param" and command file named "commands" that are both in the current working directory, one would enter:

mdSim -p param -c commands

To run a single-processor MC simulation using input files with the same names, one would instead enter

mcSim -p param -c commands

During execution, some log information is written to standard output (i.e., the screen). This log output can be directed to a file using, for example,

mdSim -p param -c commands > log

Standard output should normally be redirected to a file when a job is run in background or in a queue.

Running a parallel MD simulation

The syntax for executing the parallel ddSim molecular dynamics program is similar to that used for mdSim or mcSim, except for the need to launch the program as a multi-processor MPI program. The syntax for launching an MPI program depends on the system configuration and choice of library, but usually involves a script called "mpirun". Normally mpirun accept an option -np that takes the required number of processors as a argument.

To use mpirun to start a ddSim program on 32 processors, using parameter and command files in the working directory named param and commands, and redirect log output to a file named log, one would enter

mpirun -np 32 ddSim -p param -c commands > log

For a ddSim job, the number of processors declared in the command line (e.g., 32 in this example) must be consistent with the number of processors in the processor grid defined in the Domain block of the parameter file (e.g., a 4 x 4 x 2 grid for a simulation with 32 processors).

The "echo" option

All simpatico programs accept a command line option "-e" that causes the contents of each line of the parameter file to be echoed as the file is being read. For example, to invoke a "ddSim" simulation of 32 processors with echoing to a log file, one could enter

mpirun -np 32 ddSim -e -p param -c commands > log

The same option is also accepted by mdSim and mcSim. This option is useful for locating errors in the parameter file, because the echoed output ends immediately before the line at which an error in the parameter file is detected, and is followed by an error message that explains the nature of the error.

Other available command line options are discussed below.

Program structure

To understand the program flow, it may help to look briefly at the structure of a main program, which is similar for all of the simpatico simulation programs. The main program files for single processor MC and MD simulations are src/mcMd/mcSim.cpp and src/mcMd/mdSim.cpp. The main program for parallel MD simulation is src/ddMd/ddSim.cpp. All three programs have a similar structure.

Here is a slightly simplified version of the main program src/mcMd/mcSim.cpp for MC simulations:

int main(int argc, char** argv)
{
   McMd::McSimulation simulation;
  
   \\ Process command line options
   simulation.setOptions(argc, argv);
  
   \\ Read the parameter file8
   simulation.readParam();
  
   \\ Read the command file
   simulation.readCommands();
  
}

This simplified (but functional) version of the program has only four executable lines:

The first line creates an object named "simulation" that is an instance of the class McMd::McSimulation. An McMd::McSimulation object represents a complete Monte Carlo (MC) simulation.
The second line invokes the McMd::McSimulation::setOptions() member function. This function processes any command line options that are passed to the program. Some of the available command line options are discussed below.
The third line invokes the readParam() member function. This function reads the parameter file that was specified as an argument to the -p command line option and initializes the simulation.
The fourth line invokes the readCommands() member function. This function reads the command file that was specified as an argument to -c command line option and executes each command in sequence, until it encounters a command that tells it to stop execution.

The analogous code for the single-processor MD (mdSim) and parallel MD (ddSim) programs are almost identical. The only difference is that the main object in mdSim.cpp is an instance of class McMd::MdSimulation, and the main object in ddSim.cpp is an instance of class DdMd::Simulation. McMd is the name of a C++ namespace that contains code for single-processor MC and MD simulations. DdMd is a namespace that contains code for parallel domain decomposition MD simulations.

Parameter file

The parameter file contains all the data required to initialize a simulation, i.e., to initialize all variables to valid values, allocate all required memory, and leave the main simulation object ready for use. These data includes:

Parameters that control how much memory should be allocated for use during a simulation, such as the maximum allowable number of atoms or molecules.
Physical parameters such as the temperature and potential energy parameters.
Choices of specific simulation algorithms, such as a set of Monte Carlo moves for a MC simulation or the integration algorithm for an MD simulation, and parameters required by these algorithms (e.g,. the time step for an MD integrator).
Specification of any data analysis and/or data output operations that should be executed during execution, for "on-the-fly" analysis.
File name prefixes for input and output files.

In mcSim and mdSim simulations, the parameter file also contains a description of the chemical structure (topology) of every molecular species that can appear in the simulation. In ddSim simulations, this information about molecular structure is instead embedded in the configuration file (discussed below). Examples of parameter file formats for different types of simulation are shown and discussed here.

Command file

The command file contains a sequence of commands that are read and executed in the order they appear. Each line of the command file starts with a capitalized command name, followed by zero or more arguments. The minimal command file for an mcSim simulation looks like this:

READ_CONFIG       in
SIMULATE          100000
WRITE_CONFIG      out
FINISH   

This file instructs the program to read a configuration file named "in", run a simulation of 100000 attempted MC moves, write a final output configuration file to a file named "out", and then stop. The command file is read by a loop that terminates when a line containing only the command FINISH is encountered. A full list of valid commands and their arguments is given here.

The relative paths to input and output configuration files are constructed by concatenating input and output prefix strings (named inputPrefix and outputPrefix) that are specified in the parameter file to file names specified as arguments of the READ_CONFIG and WRITE_CONFIG commands in the command file. By convention, the values of the inputPrefix and outputPrefix strings are often set to directory name strings, which end with a directory separator symbol "/", in order to place input and output files in different directories. Thus, for example, if the parameter file sets inputPrefix=in/ and outputPrefix=out/, then the command "READ_CONFIG config" would read the configuration file in/config, while "WRITE_CONFIG config" would write to the file out/config.

Configuration file

A configuration file specifies atomic positions and other characteristics of a system that can change during the course of a simulation. Slightly different default file formats are used by mcSim, mdSim, and ddSim, each of which can also read and write configuration files using a several non-default file formats.

The default configuration file format for mcSim contains the dimensions of the periodic system boundary, the number of molecules of each molecular species, and the positions of all atoms. The default configuration file format for mdSim also contains atomic velocities. The default file formats for these two programs do not contain any information about molecular topology (i.e., which atoms are connected to which), because this is already specified in the param file formats for these programs.

The default configuration file format for ddSim contains periodic boundary dimensions, atomic positions and velocities. Unlike the configuration file formats for mcSim and mdSim, this format also contains structural information about topology. This is given by listing ids for atoms involved in all bond (2-body), angle (3-body) and dihedral (4-body) covalent groups.

Configuration file formats for all programs are described in more detail here.

Output files

Each simulation writes a limited amount of log information to report the progress of the simulation and to summarize statistics at the end of a simulation. In simulations of a single system, this is log output written to linux standard output. In an interactive job, it will thus appear on the user's screen unless redirected to a file.

A variety of other output files may also be written by classes that implement on-the-fly statistical analyses or data output operations. These data analysis and data output classes are all subclass of either the McMd::Analyzer (for mcSim or mdSim) or DdMd::Analyzer (for ddSim) base classes, and will thus be referred to generically as analyzer classes. The parameter file for each type of simulation specifies a set of analyzer objects that should used during a simulation. Each such object carries out a computation and/or outputs data to file at a specified interval. Most analyzer classes write output to one or more separate files. Generally, a different output file is used for each analyzer object. The path to each such output file is constructed by prepending a common string variable, named outputPrefix, to a base name that is specified in the parameter file block associated with a specific Analyzer. The convention of setting outputPrefix to a directory name, such as "outputPrefix=out/", thus automatically puts all analyzer output files in a single directory.

The list of analyzers in a parameter file may contain one that specifies that the system configuration be dumped to file periodically during the course of the simulation for later post-processing.

Command line options

All simpatico simulations of single systems can be invoked with any of the following command line options:

-e: Activates echoing of the parameter file to standard output.
-p filename: Specifies the name of a parameter file
-r filename: Restarts and continues a previous simulation.
-p filename: Specifies the name of a parameter file
-c filename: Specifies the name of a command file
-i filename: Specifies a prefix string for input data files
-o filename: Specifies a prefix string for output data files

The -e (echo) option causes each parameter in the parameter file to be echoed to standard output immediately after it is read. This option takes no arguments.

The -r (restart) option takes a required parameter "filename". This is the name of a restart file produced during a previous simulation. A simulation can be restarted from the most recent restart (i.e,. checkpoint) file to complete a simulation that crashed due to hardware failure or expiration of a wall clock limit, or to extend a simulation that completed normally. Restarting is discussed in more detail here.

The -p (parameter) option takes a required string parameter, which is the name of a parameter file. The -p and -r options are incompatible: The simulation must be initialized with either a parameter or a restart file, but not both.

The -c (command) option takes a required string parameter, which is the name of a command file.

If this file name is not specified as a command line option, it can be specified as the first argument of the FileMaster parameter block in the parameter file.

The -i (input prefix) option takes a required string parameter, which is a prefix that will be prepended to the names of all input data files.

The -o (output prefix) option takes a required string parameter, which is a prefix that will be prepended to the names of all output data files.

Other command line options that are relevant only to multi-system simulations are discussed separately here.

Specifying File Names and Path Prefixes

When a simulation program is executed, several file names or file paths must provided before the simulation can run. One of these must be provided as a command line argument, while others can provided either as command line arguments or in the parameter file. When a program is executed, the name of either a parameter file or a restart file (but not both) must be specified as an argument of the -p or -r command line option. In addition, the program needs the name of a command file and (optionally) values for the input and output prefix strings that are prepended to names of input and output files. The name of an input configuration file (excluding the input prefix, if any) must be specified in the command file, as an argument to a READ_CONFIG command.

The arguments of the -c, -i and -o command line options are all strings that represent file names or path prefixes, and that can be specified either as command line option arguments or as parameter in a parameter file. Specifically, the name of the command file can be specified either as an argument of the -o command line option or as the value of an optional commandFile parameter in the FileMaster block of a parameter file. Similarly, the input and and output prefix strings that are prepended to names of input and output data files can be specified either as arguments of the -i or -o command line options or as values of optional inputPrefix and outputPrefix parameters in the parameter file. If a program is restarted, by invoking the program with the -r option, values of the command file name and input and output prefixes that were used in the previous simulation are also contained in the restart file.

If a value for a command file name, input prefix or output prefix is provided both as a command line option argument and in a parameter or restart file, the following precedence rules apply:

If a simulation program is initialized from a parameter file, by invoking the executable with the -p option, a value for any one of these three strings that is provided in the parameter file will be used in preference to a value provided as a command line argument. If a value for one of these strings is given both in the command line and in the parameter file, the value given in the command line will be silently ignored. When a program is invoked with the -p option, a value provided on the command line will thus be used if and only if no corresponding value is given in the parameter file. If one wishes to use a command line option to specify a value for one of these strings, one must thus use a parameter file that does not contain a value for the corresponding parameter.

If a simulation is restarted, by invoking the program with the -r option, however, a command file name or prefix string given as a command line option will instead be used in preference to a value given in the parameter file. In this case, if a value for one of these strings is given in the command line, the value that was used in the previous simulation will be replaced and silently ignored.

The precedence given to path names provided as command line options when a program is restarted is necessary in order to allow a user to redirect output files to a different directory than that used in the original simulation. Analyzers that create data files during a simulation generally open new files with names that begin with the output prefix when a program is restarted. When one restarts a simulation by invokes a program with the -r option, the -o option should normally be used in the restarted program to redirect output of the restarted simulation to a different output directory than that used in the original run. If the output is not redirected in this way, output files that are created during the restarted will be placed in the same locations as those created during the original run, and will thus overwrite files created during the original run. See the discussing of restarting for a more complete discussion.

2.2 Compiling (Prev) 2 User Guide (Up) 2.4 Parameter Files (Next)