Failing parameter estimation using fmincon. Objective function is undefined at initial point.

Question

Chuck Noise 2023-7-28

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/2001892-failing-parameter-estimation-using-fmincon-objective-function-is-undefined-at-initial-point

评论： Chuck Noise 2023-7-28

Hi

I am trying to make a Matlab script that uses a continuous-discrete extended Kalman filter (CD-EKF) and prediction-error-method (PEM) in order to estimate several of the parameters in a SDE. I spend several days trying to implement the PEM par using fmincon and I am now at a place where I could really use some help. When I run my main script I get the error:

Error using barrier

Objective function is undefined at initial point. Fmincon cannot continue.

I hope someone here can give me a notch in the right direction.

There are probably a number of places where I might have made a mistake, so I'll try to just go through the most of the code and leave some of the functions out. If one of the functions is necessary to see let me know.

First I got my main script:

% True parameters for simulation of CSTR
dH = -560;
rho = 1;
cp = 4.186;
p.V = 0.105;
p.CAin = 1.6/2;
p.CBin = 2.4/2;
p.Tin = 273.65;
p.logk0 = 24.6;
p.EaR = 8500;
p.beta = -dH/(rho*cp);
p.sigma = 5;
p.sigma_v = 0.15;
t0 = 0; % Starting time
tfMinutes = 35; % Final time in minutes
tf = tfMinutes*60; % Final time in seconds
N = 1000; % Steps per minute
Ns = 1; % Number of realizations for the standart Wiener process
nw = 1; % Dimensio for the standart Wiener process (one as the CSTR is a one-state model)
seed = 1401;
rng(seed)
x0 = 273.65; % Initial temperature in Kalvin
t = 0:60:tf; % Time vector
% Flow rate
U = [repmat(0.7/60,1,3*N) repmat(0.6/60,1,2*N) repmat(0.5/60,1,2*N) repmat(0.4/60,1,2*N) ...
    repmat(0.3/60,1,3*N) repmat(0.2/60,1,4*N) repmat(0.3/60,1,2*N) repmat(0.4/60,1,2*N) ...
    repmat(0.5/60,1,2*N) repmat(0.6/60,1,2*N) repmat(0.7/60,1,4*N) repmat(0.2/60,1,4*N) ...
    repmat(0.7/60,1,3*N)];
% 1D Euler-Maruyama (explicit-explicit) fixed step size
[W,T,~]=StdWienerProcess(tf,N*tfMinutes,nw,Ns,seed); % Standart Wiener process to generate white noise
Unrecognized function or variable 'StdWienerProcess'.
X = zeros(1,length(T),Ns);
% Simulation of the one-state CSTR model with true parameters
for i=1:Ns
    X(:,:,i) = SDEsolverExplicitExplicit(...
        @CSTR1dDrift,@CSTRDiffusion,...
        T,x0,W(:,:,i),U,p);
end
% Estimating state and adding measurement noise. Due to the one-state model it is simply y = x + v 
% where v = N~(0,sigma_v^2)
y = state_estimation(1, X, p.sigma_v, seed); 
% Known values of p.V, p.CAin, p.CBin, p.Tin and p.sigma_v which are fixed
% parameters and not to be estimated
fixed_params = [p.V, p.CAin, p.CBin, p.Tin, p.sigma_v];
% Initial guesses for parameters that ought to be estimated, i.e. beta,
% log(k0), EaR and sigma
p_initial_guess = [130, 24, 8300, 7];
% Perform maximum likelihood parameter estimation
theta_estimated = ml_parameter_estimation(y, U, x0, T, p_initial_guess, fixed_params);

For the estimation of the parameters I am trying to use a maximum likelihood estimation by minimizing the negative log-likelihood function:

Where the innovation,

, and covariance,

, are computed from CD-EKF.

The ml_parameter_estimation function is:

function theta_est = ml_parameter_estimation(y, Us, x01d, Ts, p_initial, fixed_params)
    % Set lower and upper bounds for the parameters in theta
    lb = [129, 22, 8000, 3];
    ub = [140, 26, 9000, 8];
    % Define the objective function for maximum likelihood parameter estimation
    objective_function = @(theta) ml_objective_function(theta, y, Us, x01d, Ts, fixed_params);
    % Perform optimization to find the estimated parameter vector theta_est
    options = optimoptions('fmincon', 'Algorithm', 'interior-point', 'Display', 'iter');
    theta_est = fmincon(objective_function, p_initial, [], [], [], [], lb, ub, [], options);
end

The ml_objective_function function is:

function neg_log_likelihood = ml_objective_function(theta, y, Us, x01d, Ts, fixed_params)
    % Extract the parameters from the theta vector
    p.V = fixed_params(1);
    p.CAin = fixed_params(2);
    p.CBin = fixed_params(3);
    p.Tin = fixed_params(4);
    p.sigma_v = fixed_params(5);
    p.Rk = p.sigma_v^2;
    
    % Extract the remaining parameters from the theta vector
    p.beta = theta(1);
    p.logk0 = theta(2);
    p.EaR = theta(3);
    p.sigma = theta(4);
    % Initialize CD-EKF
    [~, ~, E_k, R_E_k] = cd_ekf(@CSTR1dFunJac, y, Us, x01d, 1, Ts, 1, p);
    % Calculate the negative log-likelihood (maximize the log-likelihood)
    N = size(y, 2);
    n_y = size(y, 1);
    neg_log_likelihood = (N + 1) / 2 * n_y * log(2 * pi);
    for k = 1:N
        neg_log_likelihood = neg_log_likelihood + 0.5 * (log(det(R_E_k(k))) + E_k(k)' * (R_E_k(k) \ E_k(k)));
    end
end

The cd_ekf function is:

function [Xhat, P, E_k, R_E_k] = cd_ekf(funJac, y, u, x0, P0, T, C, p)
xhat_k1_k1 = x0; % Initial state estimate
P_k1_k1 = P0; % Initial state covariance
Xhat = zeros(length(y),1);
Xhat(1) = xhat_k1_k1;
P = zeros(length(y),1);
P(1) = P_k1_k1;
E_k = zeros(length(y),1);
R_E_k = zeros(length(y),1);
Rk = p.sigma_v^2;
% Perform prediction and filtering for each time step
for k = 1:length(T)-1
    u_k1 = u(k); % Previous manipulated variable
    y_k = y(k);   % Measurement at current step
    
    % Prediction step
    [xhat_k_k1, P_k_k1] = cd_ekf_prediction(funJac, xhat_k1_k1, P_k1_k1, u_k1, p, T(k), T(k+1));
    
    % Filtering step
    [xhat_k_k, P_k_k, e_k, R_e_k] = cd_ekf_filtering(xhat_k_k1, P_k_k1, y_k, Rk, C);
    
    % Update for the next iteration
    xhat_k1_k1 = xhat_k_k;
    P_k1_k1 = P_k_k;
    
    Xhat(k+1) = xhat_k1_k1;
    P(k+1) = P_k1_k1;
    E_k(k+1) = e_k;
    R_E_k(k+1) = R_e_k;
end

The cd_ekf_prediction function is:

function [xhat_k_k1, P_k_k1] = cd_ekf_prediction(funJac, xhat_k1_k1, P_k1_k1, u_k1, p, t_k1, t_k)
    % Compute the one-step prediction using Explicit Euler method
    N = 10; % Number of steps
    [~, xhat_k_k1] = ExplicitEulerFixed(funJac, [t_k1, t_k], N, xhat_k1_k1, u_k1, p);
    xhat_k_k1 = xhat_k_k1(end);
    
    % Compute the Jacobian A_k1 using the provided function handle
    [~, A_k1] = feval(funJac, t_k1, xhat_k1_k1, u_k1, p);
    
    % Compute the derivative of P_k1_k1
    sigma_k1 = p.sigma;
    dP_k1_k1 = @(t, P) A_k1 * P + P * A_k1' + sigma_k1 * sigma_k1';
    [~, P_k_k1] = ExplicitEulerFixed(dP_k1_k1, [t_k1, t_k], N, P_k1_k1);
    P_k_k1 = P_k_k1(end);
end

And finally the cd_ekf_filtering function is:

function [xhat_k_k, P_k_k, e_k, R_e_k] = cd_ekf_filtering(xhat_k_k1, P_k_k1, y_k, R_k, C_k)
    % Compute the measurement estimate
    yhat_k_k1 = xhat_k_k1;
    
    e_k = y_k - yhat_k_k1;
    R_e_k = R_k + C_k * P_k_k1 * C_k';
    
    % Compute the Kalman gain
    K_k = P_k_k1 * C_k' / R_e_k;
    
    % Update the state estimate and its covariance
    xhat_k_k = xhat_k_k1 + K_k * e_k;
    P_k_k = (eye(size(P_k_k1)) - K_k * C_k) * P_k_k1 * (eye(size(P_k_k1)) - K_k * C_k)' + K_k * R_k * K_k';
end

I know it is a pretty big implementation to digest and I hope that someone has the time and patience to help me. If any other scripts are needed such as the Euler-Maruyama, standart Wiener process etc. are needed let me know.

Thanks