PSCF v1.3.2
Many of the SCFT iteration algorithms and PS-FTS compressor algorithms used by PSCF are based on some variant of the Anderson mixing algorithm. Anderson mixing (AM) is a general strategy for iteratively solving a nonlinear system of equations by taking advantage of information obtained during several previous iterations. PSCF C++ classes that use this strategy are all derived from a class template named Pscf::AmIteratorTmpl that provides data structures and algorithms required in all variants of this strategy.
Consider a system of \( N \) nonlinear equations that must be solved for \( N \) variables. Let \( {\bf X} \) denote a vector of unknown variables with elements \( X_{0}, \ldots, X_{N-1} \). We will refer to this vector of variables as a state vector. We assume without loss of generality that the problem is expressed as a system of equations of the form
\[ R_{i}({\bf X}) = 0 \]
for all integer \( i = 0, \ldots, N-1 \). Here, \( R_{i} \) denotes a single element of a residual vector \( {\bf R} \) with \( N \) elements \( (R_{0}, \ldots, R_{N-1} )\), such that each element of this vector is a function of the full state vector \( {\bf X} \).
To describe a sequence of states generated by an iterative algorithm, let \( {\bf X}^{k} \) and \( {\bf R }^{k} \) denote values of the state vector and residual vector, respectively, at the end of iteration step \( k \), for \( k = 1, 2, \ldots \). Let \( {\bf X}^{0} \) and \( {\bf R}^{0} \) denote initial values, before the first iteration.
Each iteration of an AM algorithm consists of two steps, which we refer to as projection and correction. The projection step uses information from previous iterations to produce an intermediate state, and the correction step attempts to further reduce the error remaining in this intermediate state. To describe this, let \( \underline{{\bf X}}^{k} \) denote the intermediate value of the state vector obtained after the projection step of iteration \( k \) but before the correction step.
The projection step uses information from previous states, and relies on an assumption that the initial state vector is close enough to a solution that the function \( {\bf R}({\bf X}) \) is nearly linear. For each iteration \( k > 1 \), the projection step uses stored values of \( {\bf X} \) and \( {\bf R} \) at the current state and at \( K \) previous states to construct an intermediate trial state vector \( \underline{{\bf X}}^{k+1} \) as a linear superposition of the form
\[ \underline{{\bf X}}^{k+1} = {\bf X}^{k} + \sum_{j=0}^{K-1} ( {\bf X}^{k-j} - {\bf X}^{k-j-1} ) C_j \quad, \]
in which \( C_{0}, \ldots, C_{K-1} \) are real coefficients that are chosen to minimize the norm of an estimate of the resulting residual vector. The parameter \( K \) grows by 1 after each iteration after the first until it reaches a user-defined integer parameter \( N_{\rm h} \) that gives the maximum allowable number of previous states. A predicted value of the residual vector in this trial state, denoted by \( \underline{{\bf R}}^{k+1} \), is obtained by assuming linearity, giving a predicted residual vector
\[ \underline{{\bf R}}^{k+1} ={\bf R}^{k} + \sum_{j=0}^{K-1} ({\bf R}^{k-j} - {\bf R}^{k-j-1} ) C_j \quad, \]
that is given by a corresponding linear superposition of previous residual vector values. The values of the coefficients \( C_{0}, \ldots, C_{K-1} \) are chosen so as to minimize the \( l_{2} \) norm of the predicted residual \( \underline{{\bf R}}^{k+1} \). This corresponds to minimization of the linearized residual within a \( K \) dimensional subspace for the difference \( \underline{{\bf R}}^{k+1} - {\bf R}^{k} \) that is spanned by a basis of \( K \) differences between residual vectors in previous states. We refer to this as "projection" because it yields a change in the state vector that has been projected into the corresponding low dimensional subspace spanned by differences between previous values of \( {\bf X} \).
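As a concrete illustration of this superposition, the construction of a trial vector from a stored history could be sketched in C++ roughly as follows. This is an invented example, not code from the PSCF source; the function name, data layout, and history ordering are assumptions made for this sketch:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Form the superposition  trial = hist[0] + sum_j C[j]*(hist[j] - hist[j+1]),
// where hist[0] holds the current vector and hist[j] the vector from j
// iterations ago. The same routine applies to a history of either the
// state vectors X or the residual vectors R.
std::vector<double>
superpose(const std::vector<std::vector<double>>& hist,
          const std::vector<double>& C)
{
    std::vector<double> trial = hist[0];
    for (std::size_t j = 0; j < C.size(); ++j) {
        for (std::size_t i = 0; i < trial.size(); ++i) {
            trial[i] += C[j] * (hist[j][i] - hist[j + 1][i]);
        }
    }
    return trial;
}
```

Applying the same routine to the \( {\bf X} \) history and the \( {\bf R} \) history with a common coefficient list yields the trial state and the predicted residual, respectively.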
As part of the projection step discussed above, we must compute the list of coefficients that minimize the norm of the predicted residual. This is a standard least-squares problem. To discuss this, we define basis vectors
\[ {\bf Y}_{j} \equiv {\bf R}^{k-j} - {\bf R}^{k-j-1} \]
for \( j = 0, \ldots, K - 1 \), and express \( \underline{\bf R}^{k+1} \) as a sum
\[ \underline{{\bf R}}^{k+1} = {\bf R}^{k} + \sum_{j=0}^{K-1} C_j {\bf Y}_{j} \quad. \]
The coefficients that minimize \( |\underline{\bf R}^{k+1}|^{2} \) must satisfy the criteria
\[ 0 = \frac{\partial |\underline{\bf R}^{k+1}|^{2}}{\partial C_{i}} = 2 \frac{\partial \underline{\bf R}^{k+1}}{\partial C_{i}} \cdot \underline{{\bf R}}^{k+1} = 2 {\bf Y}_{i} \cdot \underline{{\bf R}}^{k+1} \]
for all \(i = 0, \ldots, K - 1 \). Expanding the resulting expressions yields a linear system of equations of the form
\[ \sum_{j=0}^{K-1} U_{ij}C_{j} = -V_{i} \]
in which we have defined a \( K \times K \) square matrix \( U \) and a column vector \( V \) with elements
\[ U_{ij} \equiv {\bf Y}_{i} \cdot {\bf Y}_{j} \quad\quad V_{i} \equiv {\bf Y}_{i} \cdot {\bf R}^{k} \]
for \( i, j = 0, \ldots, K-1 \). The vector of coefficients that solves this problem yields a minimal predicted residual \( \underline{{\bf R}}^{k+1} \) that is orthogonal to the \( K \) dimensional subspace spanned by \( {\bf Y}_{0}, \ldots, {\bf Y}_{K-1} \). Once they are known, these coefficients can also be used to compute the trial state vector \( \underline{{\bf X}}^{k+1} \).
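The construction and solution of this small \( K \times K \) system can be sketched in C++ as follows. This is a self-contained illustration under assumed data structures, not PSCF's actual implementation; all names are invented for this example:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dot product of two vectors of equal length.
static double dot(const std::vector<double>& a, const std::vector<double>& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Given basis vectors Y[j] = R^{k-j} - R^{k-j-1} and the current residual
// Rk = R^k, build U_ij = Y_i . Y_j and the right-hand side -V_i = -Y_i . R^k,
// then solve  sum_j U_ij C_j = -V_i  by Gaussian elimination with
// partial pivoting. Returns the coefficient vector C.
std::vector<double>
solveCoefficients(const std::vector<std::vector<double>>& Y,
                  const std::vector<double>& Rk)
{
    const std::size_t K = Y.size();
    std::vector<std::vector<double>> U(K, std::vector<double>(K));
    std::vector<double> C(K);
    for (std::size_t i = 0; i < K; ++i) {
        for (std::size_t j = 0; j < K; ++j) U[i][j] = dot(Y[i], Y[j]);
        C[i] = -dot(Y[i], Rk);   // right-hand side -V_i
    }
    // Forward elimination with partial pivoting.
    for (std::size_t p = 0; p < K; ++p) {
        std::size_t q = p;
        for (std::size_t r = p + 1; r < K; ++r)
            if (std::fabs(U[r][p]) > std::fabs(U[q][p])) q = r;
        std::swap(U[p], U[q]);
        std::swap(C[p], C[q]);
        for (std::size_t r = p + 1; r < K; ++r) {
            double f = U[r][p] / U[p][p];
            for (std::size_t c = p; c < K; ++c) U[r][c] -= f * U[p][c];
            C[r] -= f * C[p];
        }
    }
    // Back substitution.
    for (std::size_t p = K; p-- > 0; ) {
        for (std::size_t c = p + 1; c < K; ++c) C[p] -= U[p][c] * C[c];
        C[p] /= U[p][p];
    }
    return C;
}
```

Because \( K \le N_{\rm h} \) is small (tens at most), the cost of this dense solve is negligible compared to evaluating the residual itself.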
The correction step adds an additional correction vector, denoted by \( {\bf D}^{k+1} \), to obtain a new state
\[ {\bf X}^{k+1} = \underline{{\bf X}}^{k+1} + {\bf D}^{k+1} \]
The value of the correction vector \( {\bf D}^{k+1} \) generally depends upon the residual vector \( \underline{{\bf R}}^{k+1} \) that is predicted to remain after the projection step. The correction vector generally does not lie within the \( K \) dimensional subspace in which the projection step attempts to minimize the residual.
The correction step proposed in the original Anderson mixing algorithm uses a correction vector that is simply proportional to the predicted residual, giving
\[ {\bf D}^{k+1} = \pm \lambda \underline{{\bf R }}^{k+1} \quad, \]
where \( \lambda \) is a user-specified "mixing" parameter. By default, the Pscf::AmIteratorTmpl class template uses a slightly more complicated correction vector of the form
\[ {\bf D}^{k+1} = \alpha \lambda \underline{{\bf R}}^{k+1} \quad, \]
in which \( \alpha \) is a dimensionless "ramp factor" that increases over the first few steps of iteration and approaches \( 1 \) after an initial ramp-up period. This quantity is given by
\[ \alpha = 1 - r^{K+1} \]
for \( K < N_{\rm h} \) and \( \alpha = 1 \) otherwise, where \( K \) is the current number of basis vectors and \( r \) is a parameter in the range \( r \in [0, 1] \). The Pscf::AmIteratorTmpl template uses default values of
\[ \lambda = 1 \quad\quad r = 0.9 \]
if the user does not explicitly set values for these parameters. The implementation provided by this template also allows the user to disable use of the ramp factor (thus setting \( \alpha = 1 \) for every iteration) by setting a boolean parameter named useLambdaRamp to false (or 0).
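The projection and correction steps described above can be combined into a complete iteration loop. The following C++ sketch applies the scheme to a toy two-variable fixed-point problem, using a history of at most one previous state so that the least-squares solve reduces to a scalar division. Everything here (the residual function, parameter choices, and all names) is invented for illustration and is not PSCF code:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Residual R(X) = g(X) - X for the toy fixed-point map g(X) = 0.5*X + b,
// with b = (1, 2); the exact solution is X = (2, 4).
static Vec residual(const Vec& x) {
    Vec r(x.size());
    r[0] = 0.5 * x[0] + 1.0 - x[0];
    r[1] = 0.5 * x[1] + 2.0 - x[1];
    return r;
}

// Anderson mixing with maxHist = 1, so the projection coefficient is
//   C0 = -(Y0 . R^k) / (Y0 . Y0),  with  Y0 = R^k - R^{k-1},
// followed by a correction D = alpha*lambda*R with ramp factor
// alpha = 1 - r^{K+1} while K < maxHist, and alpha = 1 thereafter.
Vec andersonMix(Vec x, double lambda, double rampRatio,
                double epsilon, int maxItr)
{
    const int maxHist = 1;
    Vec xOld, rOld;
    Vec r = residual(x);
    for (int k = 0; k < maxItr; ++k) {
        if (std::sqrt(dot(r, r)) < epsilon) break;   // converged
        Vec xTrial = x;
        Vec rTrial = r;
        int K = xOld.empty() ? 0 : 1;   // current number of basis vectors
        if (K == 1) {
            // Projection: choose C0 to minimize |R^k + C0*Y0|.
            Vec y(r.size());
            for (std::size_t i = 0; i < y.size(); ++i) y[i] = r[i] - rOld[i];
            double yy = dot(y, y);
            double c0 = (yy > 1.0e-300) ? -dot(y, r) / yy : 0.0;
            for (std::size_t i = 0; i < x.size(); ++i) {
                xTrial[i] = x[i] + c0 * (x[i] - xOld[i]);
                rTrial[i] = r[i] + c0 * y[i];
            }
        }
        // Correction step, with the ramp active only while K < maxHist.
        double alpha =
            (K < maxHist) ? 1.0 - std::pow(rampRatio, K + 1) : 1.0;
        xOld = x;
        rOld = r;
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] = xTrial[i] + alpha * lambda * rTrial[i];
        }
        r = residual(x);
    }
    return x;
}
```

For this linear toy problem the projection step recovers the exact solution once one previous state is available; a real SCFT residual is nonlinear, which is why the loop keeps iterating until the residual norm falls below epsilon.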
The descriptions of the problem statement and the AM algorithm given above are intentionally quite general. In PSCF, classes that are derived from the Pscf::AmIteratorTmpl class template are used to implement both SCFT iteration algorithms and PS-FTS compressor algorithms. SCFT iterators for periodic systems can be used either to solve the SCF equations in a fixed unit cell (which requires adjustment of a set of w fields associated with different monomer types) or to simultaneously solve the SCF equations and optimize the unit cell parameters. PS-FTS compressor algorithms adjust a single pressure-like chemical potential field so as to satisfy a self-consistent field incompressibility constraint. These different types of problem require different definitions of the residual and state vectors, with different numbers of elements.
Different algorithms for solving the same problem may also define the residual vector in different ways. The field equations that must be solved in an SCFT problem can be written in a variety of equivalent forms that correspond to different ways of defining a residual vector. Iterator algorithms for periodic structures may also define elements of the residual vector either in real space, by associating each element with a specific node on a computational grid, or in Fourier space, by instead associating each element with a specific basis function in a symmetry-adapted Fourier series.
To completely describe an algorithm that uses the AM strategy described above, one must thus specify the definition of the state vector \( {\bf X} \), the definition of the residual vector \( {\bf R} \), and the rule used to construct the correction vector \( {\bf D} \).
The design of the AmIteratorTmpl class template separates general and specific aspects of an AM algorithm, in part, by giving responsibility for operations that depend on the nature of the problem to a set of pure virtual functions that must be implemented by subclasses. Among these are a virtual function that computes the value of the residual vector, and another that defines the correction vector \( {\bf D} \) used during the correction step.
The AM iteration loop defined by the AmIteratorTmpl template terminates either: (1) when the number of iterations reaches some user-specified maximum value, or (2) when the value of a scalar error \( e({\bf R}) \) becomes less than a user-specified error tolerance \( \epsilon \). The value of the error tolerance is always given by a parameter named "epsilon" in the parameter file. The value of the scalar error is generally defined as a norm of the residual vector, as discussed below.
Users may choose from among several possible definitions of the error \( e \) by assigning one of the following allowed values to a string parameter named "errorType": "norm", which gives the \( l_{2} \) norm of the residual vector,
\[ e = |{\bf R}| = \left ( \sum_{i=0}^{N-1} R_{i}^{2} \right)^{1/2} \quad ; \]
"rms", which gives the root-mean-square magnitude of the residual elements,
\[ e = \frac{|{\bf R}|}{\sqrt{N}} = \left ( \frac{1}{N}\sum_{i=0}^{N-1} R_{i}^{2} \right)^{1/2} \quad ; \]
"max", which gives the maximum absolute magnitude of any element of the residual vector,
\[ e = \max_{i} |R_{i}| \quad ; \]
or "relNorm", which gives the norm of the residual vector relative to that of the state vector,
\[ e = \frac{|{\bf R}|}{|{\bf X}|} \quad, \]
as in Stasiak and Matsen, Eur. Phys. J. E 34, 110 (2011). An iteration algorithm successfully converges if it terminates with \( e({\bf R}) < \epsilon \) in fewer than the maximum allowed number of iterations. The choice of an appropriate value for the error threshold \( \epsilon \) (or "epsilon") may depend on both the way that the residual vector is defined and the choice of an error type.
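These four error definitions are simple to compute from the residual and state vectors. The following C++ sketch shows one way to evaluate them; the function name and signature are invented for this example, and this is not how PSCF itself organizes the computation:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Scalar error e for a given errorType value ("norm", "rms", "max",
// or "relNorm"), computed from the residual vector R and, for
// "relNorm", the state vector X.
double scalarError(const std::string& errorType,
                   const std::vector<double>& R,
                   const std::vector<double>& X)
{
    double sumSq = 0.0;
    double maxAbs = 0.0;
    for (double r : R) {
        sumSq += r * r;
        maxAbs = std::max(maxAbs, std::fabs(r));
    }
    double norm = std::sqrt(sumSq);
    if (errorType == "norm") return norm;
    if (errorType == "rms")  return norm / std::sqrt((double) R.size());
    if (errorType == "max")  return maxAbs;
    // "relNorm": |R| / |X|
    double xSq = 0.0;
    for (double x : X) xSq += x * x;
    return norm / std::sqrt(xSq);
}
```

Note that "norm" scales with the number of elements \( N \), while "rms", "max", and "relNorm" are roughly independent of \( N \), which makes the latter definitions easier to pair with a single epsilon value across different grid resolutions.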
The parameter file formats for iterator and compressor classes that are derived from the Pscf::AmIteratorTmpl class template usually accept values for the following parameters, possibly among others:
Label | Type | Description |
epsilon | real | Desired tolerance for convergence; iteration stops if the magnitude \( e \) of the error becomes less than epsilon. (Required) |
maxItr* | int | Maximum number of iterations that will be attempted. (Optional, 200 by default) |
maxHist* | int | Maximum number of previous trial solutions (\( N_{\rm h} \)) that will be retained in memory for use by the AM algorithm. (Optional, 50 by default) |
verbose* | int | Integer level 0, 1, or 2 for verbosity of log output during iteration, with 0 being most concise. (Optional, 0 by default) |
errorType* | string | Identifier for the type of variable used to define the scalar error. The only allowed values are "norm", "rms", "max", and "relNorm", as discussed above. (Optional, "relNorm" by default) |
lambda* | real | Mixing parameter in the standard Anderson mixing correction step. (Optional, 1.0 by default) |
useLambdaRamp* | bool | If true (1), use the ramp factor \( \alpha = 1 - r^{K+1} \) while \( K < \) maxHist, and \( \alpha = 1 \) thereafter. If false (0), set \( \alpha = 1 \) for all iterations. (Optional, true by default) |
r* | real | Ratio \( r \) used to compute the factor \( \alpha = 1 - r^{K+1} \) while \( K < \) maxHist, if the ramp factor is enabled. (Optional, 0.9 by default) |
Comments: