\(\renewcommand{\AA}{\text{Å}}\)
4.8.13. Writing new compute styles
Compute styles are used to calculate global or per-atom scalar, vector,
or array quantities or local or per-chunk or grid based properties from
the current state of the simulation. Examples include temperatures,
pressures, kinetic energies, and user-defined quantities. All compute
styles are derived from the Compute base class, which is defined in
src/compute.h. An overview of the available methods and flags is
given on the page for modifying or extending compute styles.
Computes are executed on demand and thus require a “consumer” like fix ave/time, compute reduce, or a custom thermo style or custom dump style. This is different from Writing a new fix style which are executed at pre-determined steps during a run and either at every steps or with a set frequency. Some compute instances are also created and used internally, e.g. for thermo output. It should be noted that some computes do not actually perform computations (e.g. computes for potential energy or virial) but rather suitable flags are set and then the corresponding data is collected by the force computation as needed and then the compute merely makes that pre-collected data available. Such computes may not be executed between runs but only during a run (or minimization).
In general, new compute styles should be added to the EXTRA-COMPUTE package. It is usually not necessary to create accelerated compute styles except for the KOKKOS package where there is a significant benefit to avoid time consuming data transfers between GPU and host. If you feel that your contribution should be added to a different package, please consult with the LAMMPS developers first. The contributed code needs to support the traditional GNU make build process and the CMake build process.
4.8.14. Case 1: A global scalar compute
In this section we describe how to implement a compute that produces a
single global scalar value. As a concrete example we use compute
temp, which computes the kinetic temperature of a group
of atoms. The full implementation can be found in
src/compute_temp.cpp and src/compute_temp.h.
Header file
Every compute style must be registered in LAMMPS by including the following lines after the copyright block and before the include guards for the class definition:
#ifdef COMPUTE_CLASS
// clang-format off
ComputeStyle(temp,ComputeTemp);
// clang-format on
#else
The block between #ifdef COMPUTE_CLASS and #else will be
included by the Modify class in modify.cpp to build a map of
factory functions. The map connects the style name temp with the
class name ComputeTemp. During compilation, LAMMPS constructs a
file style_compute.h that #includes all installed compute style
headers. The // clang-format comments prevent the clang-format
tool from incorrectly inserting spaces inside the macro arguments.
The class definition:
#ifndef LMP_COMPUTE_TEMP_H
#define LMP_COMPUTE_TEMP_H
#include "compute.h"
namespace LAMMPS_NS {
class ComputeTemp : public Compute {
public:
ComputeTemp(class LAMMPS *, int, char **);
~ComputeTemp() override;
void init() override {}
void setup() override;
double compute_scalar() override;
void compute_vector() override;
protected:
double tfactor;
virtual void dof_compute();
};
} // namespace LAMMPS_NS
#endif
#endif
The class derives from Compute and overrides the pure virtual method
init() (here with an empty inline body since there is no
initialization needed for this example) as well as several optional
methods. All overriding methods carry the override keyword.
Implementation file
Constructor
The constructor calls the base class constructor and sets the flags that tell LAMMPS what kind of output this compute produces:
ComputeTemp::ComputeTemp(LAMMPS *lmp, int narg, char **arg) : Compute(lmp, narg, arg)
{
if (narg != 3) error->all(FLERR, "Illegal compute temp command");
scalar_flag = vector_flag = 1;
size_vector = 6;
extscalar = 0;
extvector = 1;
tempflag = 1;
vector = new double[size_vector];
}
The flags have the following meanings:
scalar_flag = 1: this compute provides a global scalar viacompute_scalar().vector_flag = 1: this compute also provides a global vector viacompute_vector().size_vector = 6: the vector has 6 components (the six components of the kinetic energy tensor).extscalar = 0: the scalar is intensive (does not scale with the number of atoms since temperature is an intensive property).extvector = 1: each vector component is extensive (since kinetic energy scales with the number of atoms).tempflag = 1: marks this compute as providing a temperature, which is used by thermostat fixes to find a compatible temperature compute.
The vector array is allocated in the constructor; the base class
Compute provides a pointer vector that derived classes should
point to their allocated storage.
The destructor frees the vector array if copymode is false. The
copymode flag is set to 0 by default and set to 1 by the
ComputeTempKokkos class. This class is derived from ComputeTemp
and replaces the storage assigned to vector with a Kokkos specific
storage object to facilitate convenient data transfer between host and
accelerator devices:
ComputeTemp::~ComputeTemp()
{
if (!copymode) delete[] vector;
}
The init() method (required)
The init() method must be overridden in every compute style; it is
pure virtual in the base class. It is called once before each run or
minimization to perform one-time initialization (e.g., checking that all
required fixes or computes exist, requesting neighbor lists, etc.).
Here, init() has nothing to do and is given an empty body inline in
the header.
The setup() method (optional)
The optional setup() method is called at the start of each run,
after init(). For ComputeTemp, it computes the number of degrees
of freedom:
void ComputeTemp::setup()
{
dynamic = 0;
if (dynamic_user || group->dynamic[igroup]) dynamic = 1;
dof_compute();
}
The dynamic flag controls whether the number of degrees of freedom
is re-computed every timestep (needed when atoms enter or leave the
group or the simulation).
The compute_scalar() method (optional)
The compute_scalar() method computes the temperature and returns it
as a double:
double ComputeTemp::compute_scalar()
{
invoked_scalar = update->ntimestep;
double **v = atom->v;
double *mass = atom->mass;
double *rmass = atom->rmass;
int *type = atom->type;
int *mask = atom->mask;
int nlocal = atom->nlocal;
double t = 0.0;
if (rmass) {
for (int i = 0; i < nlocal; i++)
if (mask[i] & groupbit)
t += (v[i][0] * v[i][0] + v[i][1] * v[i][1] + v[i][2] * v[i][2]) * rmass[i];
} else {
for (int i = 0; i < nlocal; i++)
if (mask[i] & groupbit)
t += (v[i][0] * v[i][0] + v[i][1] * v[i][1] + v[i][2] * v[i][2]) * mass[type[i]];
}
MPI_Allreduce(&t, &scalar, 1, MPI_DOUBLE, MPI_SUM, world);
if (dynamic) dof_compute();
if (dof < 0.0 && natoms_temp > 0.0)
error->all(FLERR, "Temperature compute degrees of freedom < 0");
scalar *= tfactor;
return scalar;
}
There are several important points:
invoked_scalaris set to the current timestep number to allow other computes and fixes to check whether the result is fresh and avoid redundant re-computation.The group membership is checked via
mask[i] & groupbit. Thegroupbitvariable is defined in the base classComputeand corresponds to the group specified in the compute command.Atom data may be distributed across MPI ranks. Only local atoms (
i < atom->nlocal) are iterated.MPI_Allreduceis used to sum the result across all ranks.The result is stored in
scalar(declared in the base class) and also returned.The branch on
rmasshandles the case where atoms have individual masses (e.g. for granular simulations) versus per-type masses.
The compute_vector() method (optional)
The compute_vector() method computes the kinetic energy tensor and
stores it in the pre-allocated vector array:
void ComputeTemp::compute_vector()
{
invoked_vector = update->ntimestep;
double **v = atom->v;
double *mass = atom->mass;
double *rmass = atom->rmass;
int *type = atom->type;
int *mask = atom->mask;
int nlocal = atom->nlocal;
double massone, t[6];
for (int i = 0; i < 6; i++) t[i] = 0.0;
for (int i = 0; i < nlocal; i++)
if (mask[i] & groupbit) {
if (rmass)
massone = rmass[i];
else
massone = mass[type[i]];
t[0] += massone * v[i][0] * v[i][0];
t[1] += massone * v[i][1] * v[i][1];
t[2] += massone * v[i][2] * v[i][2];
t[3] += massone * v[i][0] * v[i][1];
t[4] += massone * v[i][0] * v[i][2];
t[5] += massone * v[i][1] * v[i][2];
}
MPI_Allreduce(t, vector, 6, MPI_DOUBLE, MPI_SUM, world);
for (int i = 0; i < 6; i++) vector[i] *= force->mvv2e;
}
The result must be stored in the vector array (declared in the base
class) and invoked_vector must be set to mark the result as
fresh.
4.8.15. Case 2: A per-atom compute
In this section we describe a compute that produces a per-atom quantity.
As an example we use compute ke/atom, which
computes the kinetic energy of each atom. The full implementation can be
found in src/compute_ke_atom.cpp and src/compute_ke_atom.h.
Header file
The registration block uses ComputeStyle(ke/atom,ComputeKEAtom).
The class definition:
class ComputeKEAtom : public Compute {
public:
ComputeKEAtom(class LAMMPS *, int, char **);
~ComputeKEAtom() override;
void init() override;
void compute_peratom() override;
double memory_usage() override;
private:
int nmax;
double *ke;
};
The compute overrides init(), compute_peratom(), and
memory_usage(). The per-atom data array ke and its current
allocated size nmax are private members.
Constructor
ComputeKEAtom::ComputeKEAtom(LAMMPS *lmp, int narg, char **arg) :
Compute(lmp, narg, arg), ke(nullptr)
{
if (narg != 3) error->all(FLERR, "Illegal compute ke/atom command");
peratom_flag = 1;
size_peratom_cols = 0;
nmax = 0;
}
The flags have the following meanings:
peratom_flag = 1: marks this compute as providing per-atom data viacompute_peratom().size_peratom_cols = 0: the per-atom data is a vector (one value per atom); set to a non-zero value for arrays with multiple values per atom.
The destructor must free ke:
ComputeKEAtom::~ComputeKEAtom()
{
memory->destroy(ke);
}
The init() method (required)
void ComputeKEAtom::init()
{
if (modify->get_compute_by_style(style).size() > 1)
if (comm->me == 0) error->warning(FLERR, "More than one compute {}", style);
}
This init() implementation emits a warning if more than one compute
of this style exists, since having duplicates wastes computation. Most
compute styles implement at least a minimal init() that checks for
required fixes or computes and requests neighbor lists when needed.
The compute_peratom() method (optional)
void ComputeKEAtom::compute_peratom()
{
invoked_peratom = update->ntimestep;
// grow ke array if necessary
if (atom->nmax > nmax) {
memory->destroy(ke);
nmax = atom->nmax;
memory->create(ke, nmax, "ke/atom:ke");
vector_atom = ke;
}
// compute kinetic energy for each atom in group
double mvv2e = force->mvv2e;
double **v = atom->v;
double *mass = atom->mass;
double *rmass = atom->rmass;
int *mask = atom->mask;
int *type = atom->type;
int nlocal = atom->nlocal;
if (rmass)
for (int i = 0; i < nlocal; i++) {
if (mask[i] & groupbit) {
ke[i] = 0.5 * mvv2e * rmass[i] *
(v[i][0] * v[i][0] + v[i][1] * v[i][1] + v[i][2] * v[i][2]);
} else
ke[i] = 0.0;
}
else
for (int i = 0; i < nlocal; i++) {
if (mask[i] & groupbit) {
ke[i] = 0.5 * mvv2e * mass[type[i]] *
(v[i][0] * v[i][0] + v[i][1] * v[i][1] + v[i][2] * v[i][2]);
} else
ke[i] = 0.0;
}
}
There are several important points:
invoked_peratomis set to the current timestep to mark the result as fresh.The
kearray is grown if the number of owned+ghost atoms (atom->nmax) has increased since the last allocation. The base class pointervector_atommust be updated wheneverkeis reallocated. Settingvector_atom = keafter each reallocation ensures that callers always see the current storage.Atoms not in the group have their per-atom value set to 0.0.
Only local atoms (
i < nlocal) are processed; ghost atoms are not included.
The memory_usage() method (optional)
The memory_usage() method returns an estimate of the memory usage
in bytes. This is used by the info command and for
diagnostic output:
double ComputeKEAtom::memory_usage()
{
double bytes = (double) nmax * sizeof(double);
return bytes;
}
4.8.16. Notes on flags and output types
The Compute base class provides several flags and member variables
that control how the compute’s output is used elsewhere in LAMMPS.
The most important ones for new compute styles are summarized below.
Flag |
Meaning |
|---|---|
|
1 if compute_scalar() is implemented |
|
1 if compute_vector() is implemented |
|
1 if compute_array() is implemented |
|
1 if compute_peratom() is implemented |
|
1 if compute_local() is implemented |
|
length of the global vector |
|
number of rows of the global array |
|
number of columns of the global array |
|
number of columns in the per-atom array (0=vector) |
|
0 if scalar is intensive, 1 if extensive |
|
0 if each vector component is intensive, 1 extensive |
|
0 if each array column is intensive, 1 extensive |
|
1 if compute produces a temperature |
|
1 if compute produces a pressure |
|
pointer to the global vector output array |
|
pointer to the global 2D array output |
|
pointer to the per-atom vector output array |
|
pointer to the per-atom 2D array output |
For the output arrays (vector, array, vector_atom,
array_atom), the derived class is responsible for allocating storage
and pointing the base class pointer to it. The scalar variable is
declared in the base class and need not be separately allocated.
The invoked_scalar, invoked_vector, invoked_array, and
invoked_peratom variables should be set to update->ntimestep
at the start of the corresponding compute method to allow other
routines to detect whether the output is current.