Main Content

dlode45

Deep learning solution of nonstiff ordinary differential equation (ODE)

Since R2021b

    Description

    The neural ordinary differential equation (ODE) operation returns the solution of a specified ODE.

    The dlode45 function applies the neural ODE operation to dlarray data. Using dlarray objects makes working with high dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.

    Note

    The dlode45 function best suits neural ODE and custom training loop workflows. To solve ODEs for other workflows, use ode45.

    example

    Y = dlode45(net,tspan,Y0) integrates the system of ODEs characterized by the dlnetwork object net on the time interval defined by the first and last elements of tspan and with the initial conditions Y0. (since R2023a)

    example

    Y = dlode45(odefun,tspan,Y0,theta) integrates the system of ODEs given by the function handle odefun on the time interval defined by the first and last elements of tspan, with the initial conditions Y0 and parameters theta.

    Y = dlode45(___,DataFormat=FMT) specifies the data format for the unformatted initial conditions Y0. The format must contain "S" (spatial), "C" (channel), and "B" (batch) dimension labels only.

    Y = dlode45(___,Name=Value) specifies additional options using one or more name-value arguments. For example, Y = dlode45(odefun,tspan,Y0,theta,GradientMode="adjoint") integrates the system of ODEs given by odefun and computes gradients by solving the associated adjoint ODE system.

    Examples

    collapse all

    For the initial conditions, create a formatted dlarray object containing 128 observations of numeric feature data with 4 channels. Specify the format "CB" (channel, batch).

    miniBatchSize = 128;
    numChannels = 4;
    Y0 = rand(numChannels,miniBatchSize);
    Y0 = dlarray(Y0,"CB");

    View the size and format of the initial conditions.

    size(Y0)
    ans = 1×2
    
         4   128
    
    
    dims(Y0)
    ans = 
    'CB'
    

    Create a dlnetwork object that characterizes the ODE function given by two fully connect operations with a tanh activation between them.

    layers = [
        featureInputLayer(4)
        fullyConnectedLayer(10)
        tanhLayer
        fullyConnectedLayer(4)];
    net = dlnetwork(layers);

    Specify an interval of integration of [0 0.1].

    tspan = [0 0.1];

    Apply the neural ODE operation.

    Y = dlode45(net,tspan,Y0);

    View the size and format of the output.

    size(Y)
    ans = 1×2
    
         4   128
    
    
    dims(Y)
    ans = 
    'CB'
    

    For the initial conditions, create a formatted dlarray object containing a batch of 128 28-by-28 images with 64 channels. Specify the format "SSCB" (spatial, spatial, channel, batch).

    miniBatchSize = 128;
    inputSize = [28 28];
    numChannels = 64;
    Y0 = rand(inputSize(1),inputSize(2),numChannels,miniBatchSize);
    Y0 = dlarray(Y0,"SSCB");

    View the size and format of the initial conditions.

    size(Y0)
    ans = 1×4
    
        28    28    64   128
    
    
    dims(Y0)
    ans = 
    'SSCB'
    

    Specify the ODE function. Define the function odeModel, listed in the ODE Function section of the example, which applies a convolution operation followed by a hyperbolic tangent operation to the input data.

    odefun = @odeModel;

    Initialize the parameters for the convolution operation in the ODE function. The output size of the ODE function must match the size of the initial conditions, so specify the same number of filters as the number of input channels.

    filterSize = [3 3];
    numFilters = numChannels;
    
    parameters = struct;
    parameters.Weights = dlarray(rand(filterSize(1),filterSize(2),numChannels,numFilters));
    parameters.Bias = dlarray(zeros(1,numFilters));

    Specify an interval of integration of [0 0.1].

    tspan = [0 0.1];

    Apply the neural ODE operation.

    Y = dlode45(odefun,tspan,Y0,parameters);

    View the size and format of the output.

    size(Y)
    ans = 1×4
    
        28    28    64   128
    
    
    dims(Y)
    ans = 
    'SSCB'
    

    ODE Function

    The ODE function odeModel takes as input the function inputs t (unused) and y, and the ODE function parameters p containing the convolution weights and biases, and returns the output of the convolution-tanh block operation. The convolution operation applies padding such that the output size matches the input size.

    function z = odeModel(t,y,p)
    
    weights = p.Weights;
    bias = p.Bias;
    z = dlconv(y,weights,bias,Padding="same");
    z = tanh(z);
    
    end

    Input Arguments

    collapse all

    Since R2023a

    Neural network characterizing neural ODE function, specified as an initialized dlnetwork object.

    If net has one input, then predict(net,Y) defines the ODE system. If net has two inputs, then predict(net,T,Y) defines the ODE system, where T is a time step repeated over the batch dimension.

    When the first argument is a dlnetwork object and GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".

    Function to solve, specified as a function handle that defines the function to integrate.

    Specify odefun as a function handle with syntax z = fcn(t,y,p), where t is a scalar, y is a dlarray, and p is a set of parameters. The function returns a dlarray with the same size and format as y. The function must accept all three input arguments t, y, and p, even if not all the arguments are used in the function. The size of the ODE function output z must match the size of the initial conditions.

    For example, specify the ODE function that applies a convolution operation followed by a tanh operation.

    function z = dlconvtanh(t,y,p)
    
    weights = p.Weights;
    bias = p.Bias;
    z = dlconv(y,weights,bias,Padding="same");
    z = tanh(z);
    
    end
    Note here that the t argument is unused.

    Data Types: function_handle

    Interval of integration, specified as a numeric vector or an unformatted dlarray vector with two or more elements. The elements in tspan must be all increasing or all decreasing.

    The solver imposes the initial conditions given by Y0 at the initial time tspan(1), then integrates the ODE function from tspan(1) to tspan(end).

    • If tspan has two elements, [t0 tf], then the solver returns the solution evaluated at point tf.

    • If tspan has more than two elements, [t0 t1 t2 ... tf], then the solver returns the solution evaluated at the given points [t1 t2 ... tf]. The solver does not step precisely to each point specified in tspan. Instead, the solver uses its own internal steps to compute the solution, then evaluates the solution at the points specified in tspan. The solutions produced at the specified points are of the same order of accuracy as the solutions computed at each internal step.

      Specifying several intermediate points has little effect on the efficiency of computation, but for large systems it can affect memory management.

    Note

    The behavior of the dlode45 function differs from the ode45 function.

    If InitialStep or MaxStep is [], then the software uses the values of tspan to initialize the values.

    • If InitialStep is [], then the software uses the elements of tspan as an indication of the scale of the task. When you specify tspan with different numbers of elements, the solution of the solver can change.

    • If MaxStep is [], then the software calculates the maximum step size using the first and last elements of tspan. When you change the initial or final values of tspan, the solution of the solver can change because the solver uses a different step sequence.

    Initial conditions, specified as a formatted or unformatted dlarray object.

    If Y0 is an unformatted dlarray, then you must specify the format using the DataFormat option.

    For neural ODE operations, the data format must contain "S", "C", and "B" dimension labels only. The initial conditions must not have a "T" or "U" dimension.

    Parameters of ODE function, specified as one of these values:

    • dlnetwork object (since R2023a)

    • dlarray object

    • Cell array of one or more of these values:

      • dlnetwork object (since R2023a)

      • dlarray object

    • Structure or nested structures of one or more of these values:

      • dlnetwork object (since R2023a)

      • dlarray object

    • Table with the variables Layer, Parameter, and Value, where Layer and Parameter contain the layer and parameter names, and Value contains the parameter value. Specify the variables as dlarray objects.

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: Y = dlode45(odefun,tspan,Y0,theta,GradientMode="adjoint") integrates the system of ODEs given by odefun and computes gradients by solving the associated adjoint ODE system.

    Dimension order of unformatted input data, specified as a character vector or string scalar FMT that provides a label for each dimension of the data.

    When you specify the format of a dlarray object, each character provides a label for each dimension of the data and must be one of these options:

    • "S" — Spatial

    • "C" — Channel

    • "B" — Batch (for example, samples and observations)

    • "T" — Time (for example, time steps of sequences)

    • "U" — Unspecified

    For neural ODE operations, the data format must contain "S", "C", and "B" dimension labels only. The initial conditions must not have a "T" or "U" dimension.

    You must specify DataFormat when the Y0 is not a formatted dlarray.

    Example: DataFormat="SSCB"

    Data Types: char | string

    Method to compute gradients with respect to the initial conditions and parameters when using the dlgradient function, specified as one of the following:

    • "direct" – Compute gradients by backpropagating through the operations undertaken by the numerical solver. This option best suits large mini-batch sizes or when tspan contains many values.

    • "adjoint" – Compute gradients by solving the associated adjoint ODE system. This option best suits small mini-batch sizes or when tspan contains a small number of values.

    When the first argument is a dlnetwork object and GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".

    The dlaccelerate function does not support accelerating the dlode45 function when the GradientMode option is "direct". To accelerate the code that calls the dlode45 function, set the GradientMode option to "adjoint" or accelerate parts of your code that do not call the dlode45 function with the GradientMode option set to "direct".

    Warning

    When GradientMode is "adjoint", odefun must support function acceleration. Otherwise, the function can return unexpected results.

    When GradientMode is "adjoint", the software traces the ODE function input to determine the computation graph used for automatic differentiation. This tracing process can take some time and can end up recomputing the same trace. By optimizing, caching, and reusing the traces, the software can speed up the gradient computation.

    Because of the nature of caching traces, not all functions support acceleration.

    The caching process can cache values that you might expect to change or that depend on external factors. You must take care when accelerating functions that:

    • have inputs with random or frequently changing values

    • have outputs with frequently changing values

    • generate random numbers

    • use if statements and while loops with conditions that depend on the values of dlarray objects

    • have inputs that are handles or that depend on handles

    • Read data from external sources (for example, by using a datastore or a minibatchqueue object)

    For more information on deep learning function acceleration, see Deep Learning Function Acceleration for Custom Training Loops.

    Initial step size, specified as a positive scalar or [].

    If InitialStepSize is [], then the function automatically determines the initial step size based on the interval of integration and the output of the ODE function corresponding to the initial conditions.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    Maximum step size, specified as a positive scalar or [].

    If MaxStepSize is [], then the function uses a tenth of the interval of integration size.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    Relative error tolerance, specified as a positive scalar. The relative tolerance applies to all components of the solution.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    Absolute error tolerance, specified as a positive scalar. The relative tolerance applies to all components of the solution.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    Output Arguments

    collapse all

    Solution of the neural ODE at the times given by tspan(2:end), returned as a dlarray object with the same underlying data type as Y0.

    If Y0 is a formatted dlarray and tspan contains exactly two elements, then Y has the same format as Y0. If Y0 is not a formatted dlarray and tspan contains exactly two elements, then Y is an unformatted dlarray with the same dimension order as Y0.

    If Y0 is a formatted dlarray and tspan contains more than two elements, then Y has the same format as Y0 with an additional appended "T" (time) dimension. If Y0 is not a formatted dlarray and tspan contains more than two elements, then Y is an unformatted dlarray with the same dimension order as Y0 with an additional appended dimension corresponding to time.

    Algorithms

    The neural ordinary differential equation (ODE) operation returns the solution of a specified ODE. In particular, given an input, a neural ODE operation outputs the numerical solution of the ODE y=f(t,y,θ) for the time horizon (t0,t1) and with the initial condition y(t0) = y0, where t and y denote the ODE function inputs and θ is a set of learnable parameters. Typically, the initial condition y0 is either the network input or the output of another deep learning operation.

    The dlode45 function uses the ode45 function, which is based on an explicit Runge-Kutta (4,5) formula, the Dormand-Prince pair. It is a single-step solver–in computing y(tn), it needs only the solution at the immediately preceding time point, y(tn-1) [2] [3].

    References

    [1] Chen, Ricky T. Q., Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. “Neural Ordinary Differential Equations.” Preprint, submitted June 19, 2018. https://arxiv.org/abs/1806.07366.

    [2] Dormand, J. R., and P. J. Prince. “A Family of Embedded Runge-Kutta Formulae.” Journal of Computational and Applied Mathematics 6, no. 1 (March 1980): 19–26. https://doi.org/10.1016/0771-050X(80)90013-3.

    [3] Shampine, Lawrence F., and Mark W. Reichelt. “The MATLAB ODE Suite.” SIAM Journal on Scientific Computing 18, no. 1 (January 1997): 1–22. https://doi.org/10.1137/S1064827594276424.

    Version History

    Introduced in R2021b

    expand all