Create and Modify Models¶

Like the previous Section Modeling Framework, this section is useful mostly for users who want to create new models from scratch or customize existing models. Users who only want to run simulations from existing models may skip this section.

As a simple example, we will start here from a model which numerically solves the 1-d advection equation using the Lax method. The equation may be written as:

\[\frac{\partial u}{\partial t} + \nu \frac{\partial u}{\partial x} = 0\]

with \(u(x, t)\) as the quantity of interest and where \(\nu\) is the velocity. The discretized form of this equation may be written as:

\[u^{n+1}_i = \frac{1}{2} (u^n_{i+1} + u^n_{i-1}) - \nu \frac{\Delta t}{2 \Delta x} (u^n_{i+1} - u^n_{i-1})\]

Where \(\Delta x\) is the fixed spacing between the nodes \(i\) of a uniform grid and \(\Delta t\) is the step duration between times \(n\) and \(n+1\).

We could just implement this numerical model with a few lines of Python / Numpy code, e.g., here below assuming periodic boundary conditions and a Gaussian pulse as initial profile. We will show, however, that it is very easy to refactor this code for using it with xarray-simlab. We will also show that, while enabling useful features, the refactoring still results in a short amount of readable code.

import numpy as np


# grid
spacing = 0.01
length = 1.5
x = np.arange(0, length, spacing)

# velocity
v = 1.0

# time
start = 0.0
end = 1.0
step = 0.01

# initial gauss profile
loc = 0.3
scale = 0.1
u = np.exp(-1 / scale ** 2 * (x - loc) ** 2)
u0 = u.copy()

# time loop - Lax method
factor = (v * step) / (2 * spacing)

for t in np.arange(start, end, step):
    u_left = np.roll(u, 1)
    u_right = np.roll(u, -1)
    u1 = 0.5 * (u_right + u_left) - factor * (u_right - u_left)
    u = u1.copy()

Anatomy of a Process subclass¶

Let’s first wrap the code above into a single class named AdvectionLax1D decorated by process. Next we’ll explain in detail the content of this class.

import xsimlab as xs


@xs.process
class AdvectionLax1D:
    """Wrap 1-dimensional advection in a single Process."""

    spacing = xs.variable(description="grid spacing", static=True)
    length = xs.variable(description="grid total length", static=True)
    x = xs.variable(dims="x", intent="out")

    v = xs.variable(dims=[(), "x"], description="velocity")

    loc = xs.variable(description="location of initial profile", static=True)
    scale = xs.variable(description="scale of initial profile", static=True)
    u = xs.variable(
        dims="x", intent="out", description="quantity u", attrs={"units": "m"}
    )

    def initialize(self):
        self.x = np.arange(0, self.length, self.spacing)
        self.u = np.exp(-1 / self.scale ** 2 * (self.x - self.loc) ** 2)

    @xs.runtime(args="step_delta")
    def run_step(self, dt):
        factor = (self.v * dt) / (2 * self.spacing)
        u_left = np.roll(self.u, 1)
        u_right = np.roll(self.u, -1)
        self.u1 = 0.5 * (u_right + u_left) - factor * (u_right - u_left)

    def finalize_step(self):
        self.u = self.u1

Process interface¶

AdvectionLax1D has some class attributes declared at the top, which together form the process’ “public” interface, i.e., all the variables that we want to be publicly exposed by this process. Here we use variable() to add some metadata to each variable of the interface.

We first may specify the labels of the dimensions expected for each variable, which defaults to an empty tuple (i.e., a scalar value is expected). In this example, variables spacing, length, loc and scale are all scalars, whereas x and u are both arrays defined on the 1-dimensional \(x\) grid. Multiple choices can also be given as a list, like variable v which represents a velocity field that can be either constant (scalar) or variable (array) in space.

Additionally, it is also possible to add a short description and/or custom metadata like units with the attrs argument.

Another important argument is intent, which specifies how the process deals with the value of the variable. By default, intent='in' means that the process needs a value set for the variable for its computation ; this value should either be computed elsewhere by another process or be provided by the user as model input. By contrast, variables x and u have intent='out', which means that the process AdvectionLax1D itself initializes and computes a value for these two variables.

Note also static=True set for spacing, length, loc and scale. This is to prevent providing time varying values as model inputs for those parameters. By default, it is possible to change the value of a variable during a simulation (external forcing), see Section Time-varying input values for an example. This is not always desirable, though.

Process “runtime” methods¶

Beside its interface, the process AdvectionLax1D also implements methods that will be called during simulation runtime:

.initialize() will be called once at the beginning of a simulation. Here it is used to set the x-coordinate values of the grid and the initial values of u along the grid (Gaussian pulse).
.run_step() will be called at each time step iteration. This is where the Lax method is implemented.
.finalize_step() will be called at each time step iteration too but after having called run_step for all other processes (if any). Its intended use is mainly to ensure that state variables like u are updated consistently and after having taken snapshots.

A fourth method .finalize() could also be implemented, but it is not needed in this case. This method is called once at the end of the simulation, e.g., for some clean-up.

Each of these methods can be decorated with runtime() to pass some useful information during simulation runtime (e.g., current time step number, current time or time step duration), which may be needed for the computation. Without this decorator, runtime methods must have no other parameter than self.

Getting / setting variable values¶

For each variable declared as class attributes in AdvectionLax1D we can get their value (and/or set a value depending on their intent) elsewhere in the class like if it was defined as regular instance attributes, e.g., using self.u for variable u.

Note

In xarray-simlab it is safe to run multiple simulations concurrently: each simulation has its own process instances.

Beside variables declared in the process interface, nothing prevent us from using regular attributes in process classes if needed. For example, self.u1 is set as a temporary internal state in AdvectionLax1D to wait for the “finalize step” stage before updating \(u\).

Creating a Model instance¶

Creating a new Model instance is very easy. We just need to provide a dictionary with the process class(es) that we want to include in the model, e.g., with only the process created above:

advect_model_raw = xs.Model({"advect": AdvectionLax1D})

That’s it! Now we have different tools already available to inspect the model (see Section Inspect Models). We can also use that model with the xarray extension provided by xarray-simlab to create new setups, run the model, take snapshots for one or more variables on a given frequency, etc. (see Section Setup and Run Models).

Fine-grained process refactoring¶

The model created above isn’t very flexible. What if we want to change the initial conditions? Use a grid with variable spacing? Add another physical process impacting \(u\) such as a source or sink term? In all cases we would need to modify the class AdvectionLax1D.

This framework works best if we instead split the problem into small pieces, i.e., small process classes that we can easily combine and replace in models.

The AdvectionLax1D process may for example be refactored into 4 separate processes:

UniformGrid1D : grid creation
ProfileU : update \(u\) values along the grid at each time iteration
AdvectionLax : perform advection at each time iteration
InitUGauss : create initial \(u\) values along the grid.

UniformGrid1D

This process declares all grid-related variables and computes x-coordinate values.

@xs.process
class UniformGrid1D:
    """Create a 1-dimensional, equally spaced grid."""

    spacing = xs.variable(description="uniform spacing", static=True)
    length = xs.variable(description="total length", static=True)
    x = xs.index(dims="x")

    def initialize(self):
        self.x = np.arange(0, self.length, self.spacing)

All grid variables are static, i.e., their values must be time-invariant. The x variable is declared using index(). This is a specific kind of variable intended for storing coordinate labels, here useful for indexing any data on the grid. x values must be set somewhere in the process runtime methods and they should also be time-invariant (i.e., all index variables imply intent='out' and static=True). Those values are set here once at the beginning of the simulation ; there is no need to implement .run_step().

ProfileU

@xs.process
class ProfileU:
    """Compute the evolution of the profile of quantity `u`."""

    u_vars = xs.group("u_vars")
    u = xs.variable(
        dims="x", intent="inout", description="quantity u", attrs={"units": "m"}
    )

    def run_step(self):
        self._delta_u = sum((v for v in self.u_vars))

    def finalize_step(self):
        self.u += self._delta_u

u_vars is declared as a group() variable, i.e., an iterable of all variables declared elsewhere that belong the same group (‘u_vars’ in this case). In this example, it allows to further add one or more processes that will also affect the evolution of \(u\) in addition to advection (see below).

Note also intent='inout' set for u, which means that ProfileU updates the value of \(u\) but still needs an initial value from elsewhere.

AdvectionLax

@xs.process
class AdvectionLax:
    """Advection using finite difference (Lax method) on
    a fixed grid with periodic boundary conditions.

    """

    v = xs.variable(dims=[(), "x"], description="velocity")
    grid_spacing = xs.foreign(UniformGrid1D, "spacing")
    u = xs.foreign(ProfileU, "u")
    u_advected = xs.variable(dims="x", intent="out", groups="u_vars")

    @xs.runtime(args="step_delta")
    def run_step(self, dt):
        factor = self.v / (2 * self.grid_spacing)

        u_left = np.roll(self.u, 1)
        u_right = np.roll(self.u, -1)
        u_1 = 0.5 * (u_right + u_left) - factor * dt * (u_right - u_left)

        self.u_advected = u_1 - self.u

u_advected represents the effect of advection on the evolution of \(u\) and therefore belongs to the group ‘u_vars’.

Computing values of u_advected requires values of variables spacing and u that are already declared in the UniformGrid1D and ProfileU classes, respectively. Here we declare them as foreign() variables, which allows to handle them like if these were the original variables. For example, self.grid_spacing in this class will return the same value than self.spacing in UniformGrid1D.

InitUGauss

@xs.process
class InitUGauss:
    """Initialize `u` profile using a Gaussian pulse."""

    loc = xs.variable(description="location of initial pulse", static=True)
    scale = xs.variable(description="scale of initial pulse", static=True)
    x = xs.foreign(UniformGrid1D, "x")
    u = xs.foreign(ProfileU, "u", intent="out")

    def initialize(self):
        self.u = np.exp(-1 / self.scale ** 2 * (self.x - self.loc) ** 2)

A foreign variable can also be used to set values for variables that are declared in other processes, as for u here with intent='out'.

Refactored model

We now have all the building blocks to create a more flexible model:

advect_model = xs.Model(
    {
        "grid": UniformGrid1D,
        "profile": ProfileU,
        "init": InitUGauss,
        "advect": AdvectionLax,
    }
)

The order in which the processes are given in the dictionary doesn’t matter. When creating a new instance of Model, the xarray-simlab modeling framework automatically sorts the given processes into a computationally consistent order and retrieves the model inputs among all declared variables in all processes.

In terms of computation and inputs, advect_model is equivalent to the advect_model_raw instance created above ; it is just organized differently.

Update existing models¶

Between the two Model instances created so far, the advantage of advect_model over advect_model_raw is that we can easily update the model – change its behavior and/or add many new features – without sacrificing readability or losing the ability to get back to the original, simple version.

Example: adding a source term at a specific location

For this we create a new process:

@xs.process
class SourcePoint:
    """Source point for quantity `u`.

    The location of the source point is adjusted to coincide with
    the nearest node the grid.

    """

    loc = xs.variable(description="source location")
    flux = xs.variable(description="source flux")
    x = xs.foreign(UniformGrid1D, "x")
    u_source = xs.variable(dims="x", intent="out", groups="u_vars")

    @property
    def nearest_node(self):
        idx = np.abs(self.x - self.loc).argmin()
        return idx

    @property
    def source_rate(self):
        src_array = np.zeros_like(self.x)
        src_array[self.nearest_node] = self.flux
        return src_array

    @xs.runtime(args="step_delta")
    def run_step(self, dt):
        self.u_source = self.source_rate * dt

Some comments about this class:

u_source belongs to the group ‘u_vars’ and therefore will be added to u_advected in ProfileU process.
Methods and/or properties other than the reserved “runtime” methods may be added in a Process subclass, just like in any other Python class.
Nearest node index and source rate array will be recomputed at each time iteration because variables loc and flux may both have a time dimension (variable source location and intensity), i.e., self.loc and self.flux may both change at each time iteration.

In this example we also want to start with a flat, zero \(u\) profile instead of a Gaussian pulse. We create another (minimal) process for that:

@xs.process
class InitUFlat:
    """Flat initial profile of `u`."""

    x = xs.foreign(UniformGrid1D, "x")
    u = xs.foreign(ProfileU, "u", intent="out")

    def initialize(self):
        self.u = np.zeros_like(self.x)

Using one command, we can then update the model with these new features:

advect_model_src = advect_model.update_processes(
    {"source": SourcePoint, "init": InitUFlat}
)

Compared to advect_model, this new advect_model_src have a new process named ‘source’ and a replaced process ‘init’.

Removing one or more processes

It is also possible to create new models by removing one or more processes from existing Model instances, e.g.,

advect_model_no_init = advect_model.drop_processes("init")

In this latter case, users will have to provide initial values of \(u\) along the grid directly as an input array.

Note

Model instances are immutable, i.e., once created it is not possible to modify these instances by adding, updating or removing processes. Both methods .update_processes() and .drop_processes() always return new instances of Model.

Customize existing processes¶

Sometimes we only want to update an existing model with very minor changes.

As an example, let’s update advect_model by using a fixed grid (i.e., with hard-coded values for grid spacing and length). One way to achieve this is to create a small new process class that sets the values of spacing and length:

@xs.process
class FixedGridParams:
    spacing = xs.foreign(UniformGrid1D, "spacing", intent="out")
    length = xs.foreign(UniformGrid1D, "length", intent="out")

    def initialize(self):
        self.spacing = 0.01
        self.length = 1.0

However, one drawback of this “additive” approach is that the number of processes in a model might become unnecessarily high:

advect_model_fgrid = advect_model.update_processes(
    {"fixed_grid_params": FixedGridParams}
)

Alternatively, it is possible to write a process class that inherits from UniformGrid1D, in which we can re-declare variables and/or re-define “runtime” methods:

@xs.process
class FixedGrid(UniformGrid1D):
    spacing = xs.variable(description="uniform spacing", intent="out")
    length = xs.variable(description="total length", intent="out")

    def initialize(self):
        self.spacing = 0.01
        self.length = 1.0
        super(FixedGrid, self).initialize()

We can here directly update the model and replace the original process UniformGrid1D by the inherited class FixedGrid. Foreign variables that refer to UniformGrid1D will still correctly point to the grid process in the updated model:

advect_model_fgrid2 = advect_model.update_processes({"grid": FixedGrid})

Warning

This feature is experimental! It may be removed in a next version of xarray-simlab.

In particular, linking foreign variables in a model is ambiguous when both conditions below are met:

two different processes in a model inherit from a common class (except object)
a third process in the same model has a foreign variable that links to that common class