nonlinear spline fit with unknown upper bound of interpolation domain

Question

SA-W 2025-2-19


Direct link to this question

https://ww2.mathworks.cn/matlabcentral/answers/2174234-nonlinear-spline-fit-with-unknown-upper-bound-of-interpolation-domain

Edited: SA-W 2025-4-15

I want to fit the interpolation values of a spline, but I cannot provide a good guess for the upper bound of the interpolation domain. That is, x(end) is, so to speak, an unknown too.

The value of the objective function is a pde residual and to compute it, the spline needs to be evaluated at certain points which can change during iterations. In the toy code below, I mock this by shrinking the interpolation domain over the iterations. For example, when fmincon does the last iteration, the spline was only evaluated at points xq in [0, 6] and the initial guess for the interpolation domain (x = linspace(3, 10, 5)) is not good anymore because the interpolation values defined at x > 6 have not been activated in the forward solve. As a consequence, no meaningful parameters are assigned at points x > 6.

So far, I did a multi-stage optimization: Optimizing with fixed points x, checking the final evaluation points, refining the points, optimizing again, etc. But this is very costly.

My ideas are the following:

1. Changing the interpolation points in some iterations (if required): at iteration k, the sampled points xq are in [0, 5], so the initial points (x = linspace(3, 10, 5)) are changed to x = linspace(3, 5, 5) in the next iteration. This strategy keeps the number of parameters constant (there is probably no way to change the number of parameters dynamically for any solver?). I am not sure whether this violates the differentiability assumptions of the solvers.
2. Since I only know x(end) roughly, I may treat it as an unknown too, so the new parameter vector is [y; x(end)]. In this strategy, however, I want to guide the optimizer (via a penalty or similar) so that it moves the current x(end) toward the currently active region. For instance, I can compute the min/max of the evaluation points in every iteration. But I am not sure how to translate this into a penalty term, since the min/max evaluation points are not constants.

Can you think of smarter ways to handle this? Maybe spline fit with unknown domain is a known problem.

% Define initial parameters
x = linspace(3, 10, 5);  % Interpolation points
y0 = rand(size(x));      % Initial interpolation values
% Define optimization problem
options = optimoptions('fmincon', 'OutputFcn', @output_function); % Track iteration count
% Call fmincon optimizer
global iter_count;
iter_count = 0;  % Iteration counter
[y_opt, fval] = fmincon(@objective_func, y0, [], [], [], [], [], [], [], options);
% Display results
fprintf('Optimized objective function value: %.4f\n', fval);
fprintf('Optimized spline values: [%s]\n', num2str(y_opt, '%.2f '));
function obj_val = objective_func(y)
    global iter_count;
    
    % Re-compute interpolation points
    x = linspace(3, 10, numel(y));  
    
    % Create spline f
    f = spapi(4, x, y);
    
    % Mock shrinking of evaluation domain
    shrink_factor = 1 - exp(-0.1 * iter_count);  % Exponential decay
    shrinked_domain = 5 + (10 - 5) * (1 - shrink_factor);  % Starts at 10, slowly shrinks to 5
    xq = linspace(3, shrinked_domain, 10);  % PDE evaluation points
    
    % Evaluate the spline at xq
    spline_values = fnval(f, xq);
    
    % Compute mock PDE residual (sum of squared differences)
    obj_val = sum((spline_values - mean(spline_values)).^2);
    
    % Debug print
    % fprintf('Iter %d - Eval points: [%s]\n', iter_count, num2str(xq, '%.2f '));
end
function stop = output_function(~, ~, state)
    global iter_count;
    if strcmp(state, 'iter')
        iter_count = iter_count + 1;  % Update iteration count
    end
    stop = false;  % Continue optimization
end
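A minimal sketch of the second idea above, in which x(end) is appended to the parameter vector. Everything in it is an assumption made for illustration: the names spline_at and objective, the quadratic pull toward max(xq), and the use of base-MATLAB interp1(..., 'spline') as a stand-in for spapi/fnval. Note that max(xq) is not smooth in the parameters, which is exactly the concern raised in the question.

```matlab
% Hypothetical reparameterization: p = [y(:); x_end], with x_end unknown
rng(0);
x_start = 3;   % fixed lower bound of the interpolation domain
n = 5;         % number of interpolation values

% Spline through (x_i, y_i), with the sites x_i spread over [x_start, p(end)]
% (interp1 with 'spline' is a stand-in for spapi/fnval)
spline_at = @(p, xq) interp1(linspace(x_start, p(end), n), ...
                             reshape(p(1:n), 1, []), xq, 'spline');

% Mock residual plus a quadratic pull of x_end toward the largest evaluation point
objective = @(p, xq, weight) ...
    sum((spline_at(p, xq) - mean(spline_at(p, xq))).^2) ...
    + weight * (p(end) - max(xq))^2;

p0  = [rand(n, 1); 10];        % initial guess: values and x_end = 10
xq  = linspace(3, 6, 10);      % mock evaluation points
val = objective(p0, xq, 1.0);  % scalar objective value
```

The extended vector p can be passed to fmincon in place of y0; a lower bound on p(end) slightly above x_start keeps the interpolation sites distinct.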

6 Comments

Catalytic 2025-2-19

Edited: Catalytic 2025-2-19


It is doubtful to me that your example is sufficient to convey the essentials of your problem. As far as I can tell, both x and xq are evolving in the course of the iterations. But if that is true, it is required that they both be in some way a function of the unknowns y. Your example doesn't show us this dependence.

If you are contemplating having x(end), xq, and y all be treated as independent unknowns, then that would not make sense, since the solutions become horribly ambiguous. For example, consider -

y=linspace(0,1);
x=linspace(0,1); %x(end)=1
xq=x;
spline_values=interp1(x,y,xq,'spline');

I can get the same spline_values if I were to choose instead -

y=linspace(0,1);
x=linspace(0,100); %x(end)=100
xq=x;

There may be a way to regularize this to get rid of the ambiguity, and maybe that's what you are seeking. But if so, an understanding of how to regularize will not come from showing us your objective function and optimization set-up. It comes from looking at the underlying modeling and physics.
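The ambiguity described in this comment can be checked numerically. The sketch below evaluates both configurations and compares the resulting spline values; the variable names s1, s2 and maxDiff are introduced here for illustration only.

```matlab
% Same interpolation values y on two very different domains
y  = linspace(0, 1);
x1 = linspace(0, 1);     % x(end) = 1
x2 = linspace(0, 100);   % x(end) = 100

% Evaluate each spline at its own interpolation points (xq = x)
s1 = interp1(x1, y, x1, 'spline');
s2 = interp1(x2, y, x2, 'spline');

% Both reproduce y, so the two results agree up to round-off
maxDiff = max(abs(s1 - s2));
```

Because xq scales together with x, the objective cannot distinguish the two domains unless something else ties xq or x(end) to the model.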

Matt J 2025-2-19

the spline was only evaluated at points xq in [0, 6] and the initial guess for the interpolation domain (x = linspace(3, 10, 5)) is not good anymore ... because the interpolation values defined at x > 6 have not been activated in the forward solve

What does "good" mean in this context? Why do you care what the values for x>6 are if they don't impact your objective function?

SA-W 2025-2-19

@Matt J

What does "good" mean in this context?

That the initial guess of the interpolation domain is not appropriate for the final model.

Why do you care what the values for x>6 are if they don't impact your objective function?

You are right that the objective function is barely affected by what happens at x > 6. But the optimizer sets the parameters at x > 6 to values near the upper bound I pass to the optimizer. That upper bound is much larger than the fitted parameters at x < 6, causing an extremely large curvature at x > 6, which makes the spline unsuitable for evaluation at x > 6. When I use the trained model after optimizing, I may have to evaluate at x > 6, where the spline interpolates nonsense values.

Ideally, I would crop the spline at x > 6 by discarding the parameters defined at those points, but the resulting spline is of course different from the original one.


Answer 1

Matt J 2025-2-19


Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/2174234-nonlinear-spline-fit-with-unknown-upper-bound-of-interpolation-domain#answer_1560167

Edited: Matt J 2025-2-19


The upper bound is much larger than the fitted parameters at x < 6, causing an extremely large curvature at x > 6, which makes the spline unsuitable for evaluation at x > 6.

If high curvature is the problem, perhaps put a penalty on curvature,

  
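    % Evaluate the spline at xq
    spline_values = fnval(f, xq);
    % Second derivative of the spline at the interpolation points x
    curv_values = fnval(fnder(f,2), x);

    % Penalized objective (weight is a user-chosen, non-negative penalty weight)
    obj_val = norm(spline_values - mean(spline_values)).^2 + weight*norm(curv_values).^2;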

However, I find it strange that you cannot dictate xq (or at least min(xq), max(xq) ) to your PDE solver. I also find it strange that your PDE residual term has no dependence on any measured data.

27 Comments

SA-W 2025-2-20

Edited: SA-W 2025-2-20


Exactly what I was looking for. I can then compute dx_active / dx as follows:

[x_active, ~] = ksdensity(xq, 0.95, 'Function', 'icdf')
[f, ~] = ksdensity(xq, x_active, 'Function', 'pdf')
dx_active_dx = 1 / f

Once I have "dx_active_dx", I need to multiply it by "dx/dy" to differentiate the loss term.

The next problem I see is that "x", i.e., the points at which the spline is evaluated, are not an explicit function of the parameters "y" but of the PDE solution "u". Loosely speaking, x = tr(A(u)), where A(u) is a matrix obtained from "u". This means that I can compute "x" only for a given "u". So when computing the points "x", I am at the same time computing "dx/dy".

I guess we could apply the KDE smoothing to "dx/dy" and evaluate it at x_active. But given that "dx/dy" is not injective (in theory, there may be multiple x producing the same dx/dy), any sort of interpolation should be avoided. Ideally, we can compute "dx_active_dy" in one go without chain rule, thereby only using the already available "dx/dy" at the evaluation points xq. What do I mean by "in one go"?

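A kernel density estimate approximates the density by something like

$$\hat{f}(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),$$

where K is the kernel, h the bandwidth, and x_i the n sampled points, and the derivative involves only known quantities "x" and "dx/dy".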

Does that make sense and can it be implemented easily with builtin functions?

SA-W 2025-2-20


Makes sense what you say.

I have a more concrete question based on the ksdensity suggestion (let me know if you want me to open a new question for this):

The strategy I want to pursue requires me to write a custom version of the pdf/cdf so that I can compute the derivatives analytically. Below, I managed to implement my own version of the pdf, which matches perfectly what ksdensity gives. For the cdf, however, I only managed to write a discrete version, "cumsum(f) / sum(f)", and it does not match the ksdensity version.

How can I get the analytical version of the CDF? It is probably as simple as for the pdf.

rng('default') % For reproducibility
xq = [randn(30,1); 5+randn(30,1)];
h = 1.6979;
[f, x_eval] = ksdensity(xq, 'Function', 'pdf', 'Bandwidth', h);
% Custom code for pdf
f2 = zeros(size(x_eval));
for i = 1:length(x_eval)
    f2(i) = sum(exp(-((xq - x_eval(i)).^2) / (2*h^2)) / (sqrt(2*pi) * h));
end
f2 = f2 / length(xq); % Normalize
% Compare (perfect match)
plot(x_eval, f, 'r', 'LineWidth', 2);
hold on;
plot(x_eval, f2, 'b--', 'LineWidth', 2);
legend('ksdensity for pdf', 'Custom code for pdf');
% CDF comparison (no good match)
[F, x_eval] = ksdensity(xq, 'Function', 'cdf', 'Bandwidth', h);
F2 = cumsum(f) / sum(f);
figure(2);
plot(x_eval, F, 'r', 'LineWidth', 2);
hold on;
plot(x_eval, F2, 'b--', 'LineWidth', 2);
legend('ksdensity for cdf', 'Custom code for cdf');

SA-W 2025-2-20


For a perfect analytical expression, you will have to integrate your Gaussian kernels.

Yes, the erf() function gives a perfect match (see below).

With these approximations for the PDF and CDF, I think I just have to put some pieces together. Next, I simply compute

[x_active, ~] = ksdensity(xq, 0.95, 'Function', 'icdf')

Using my terminology in the code below, how can we compute "d(x_active) / dy" assuming that "d(xq) / dy" is known? Based on the implicit function theorem, I guess we can start from "d(x_active) / dy = 1 / f2(x_active) * dF2 / dy". But I am not sure how to differentiate through "dF2 / dy".

% PDF at x_active, i.e. f2(x_active)
f2_active = sum(exp(-((xq - x_active).^2) / (2*h^2)) / (sqrt(2*pi) * h)) / length(xq);
% CDF at x_active, i.e. F2(x_active)
F2_active = sum(G((x_active - xq) / h)) / length(xq);
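One way to approach this, sketched under assumptions: for a fixed bandwidth h, differentiating F2(x_active; xq) = p implicitly gives d(x_active)/d(xq_i) = w_i, with w_i = phi(z_i) / sum_j phi(z_j) and z_i = (x_active - xq_i) / h. The minus sign from the implicit function theorem cancels against the minus sign in d(F2)/d(xq_i). The sketch assumes that the icdf option of ksdensity inverts the same Gaussian-kernel CDF F2. It uses fzero as a stand-in for that inversion, and the helper names phi, quantile_of, w and g_fd are introduced here for illustration.

```matlab
% Sensitivity of the KDE quantile x_active to the samples xq (fixed bandwidth)
rng(0);
xq = [randn(30,1); 5 + randn(30,1)];    % mock evaluation points
h  = 1.6979;                            % fixed bandwidth
p  = 0.95;                              % target probability level

phi = @(z) exp(-z.^2/2) / sqrt(2*pi);   % Gaussian kernel
G   = @(z) 0.5 * (1 + erf(z/sqrt(2)));  % CDF of the Gaussian kernel
F2  = @(x, s) mean(G((x - s)/h));       % smoothed CDF of samples s at scalar x

% Quantile by root finding (stand-in for ksdensity(..., 'Function', 'icdf'))
quantile_of = @(s) fzero(@(x) F2(x, s) - p, [min(s) - 10*h, max(s) + 10*h]);
x_active = quantile_of(xq);

% Implicit function theorem: d(x_active)/d(xq_i) = w_i, weights summing to 1
w = phi((x_active - xq)/h);
w = w / sum(w);

% Central finite-difference check of the analytic gradient
step = 1e-6;
g_fd = zeros(size(xq));
for i = 1:numel(xq)
    e = zeros(size(xq));
    e(i) = step;
    g_fd(i) = (quantile_of(xq + e) - quantile_of(xq - e)) / (2*step);
end
max_err = max(abs(w - g_fd));
```

With a Jacobian dxq_dy of size numel(xq)-by-numel(y), the chain rule then gives dx_active_dy = w' * dxq_dy, which uses only the already available d(xq)/dy at the evaluation points.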

rng('default') % For reproducibility
xq = [randn(30,1); 5+randn(30,1)];
h = 1.6979;
[f, x_eval] = ksdensity(xq, 'Function', 'pdf', 'Bandwidth', h);
% Custom code for pdf
f2 = zeros(size(x_eval));
for i = 1:length(x_eval)
    f2(i) = sum(exp(-((xq - x_eval(i)).^2) / (2*h^2)) / (sqrt(2*pi) * h));
end
f2 = f2 / length(xq); % Normalize
% Compare (perfect match)
plot(x_eval, f, 'r', 'LineWidth', 2);
hold on;
plot(x_eval, f2, 'b--', 'LineWidth', 2);
legend('ksdensity for pdf', 'Custom code for pdf');
% CDF from MATLAB
[F, x_eval] = ksdensity(xq, 'Function', 'cdf', 'Bandwidth', h);
% Custom CDF
G = @(x) 0.5 * (1 + erf(x / sqrt(2))); % CDF of Gaussian kernel
F2 = zeros(size(x_eval));
for i = 1:length(x_eval)
    F2(i) = sum(G((x_eval(i) - xq) / h)) / length(xq);
end
% Compare (perfect match)
figure(2);
plot(x_eval, F, 'r', 'LineWidth', 2);
hold on;
plot(x_eval, F2, 'b--', 'LineWidth', 2);
legend('ksdensity for cdf', 'Custom code for cdf');

SA-W 2025-4-6

@Matt J

I posted a follow-up question on the ksdensity calculation:

https://de.mathworks.com/matlabcentral/answers/2176025-differentiable-approximation-of-bandwidth-in-ksdensity-for-optimization

I would appreciate it if you could comment on that. Your ideas are always valuable!

SA-W 2025-4-15

Edited: SA-W 2025-4-15


@Matt J

In case you are still following, maybe you can share your thoughts on the following:

bw = std(xq, 1)
[x_active, ~] = ksdensity(xq, 0.95, 'Function', 'icdf', 'Bandwidth', bw)
loss = data_loss + weight * (x_active - y)^2 % y: optimization variable

As we discussed in this chat, I am adding a loss term based on the smoothed inverse CDF obtained from ksdensity, where y is an optimization variable (the last interpolation point of the spline). Then, the constraint f''>=0 needs to be re-imposed as non-linear inequality constraints as the interp domain changes...

I observed that when passing a constant bandwidth to ksdensity, the optimization works very well and robustly. But when the sampled values xq are densely clustered, the bandwidth should become small, so I used the standard deviation of xq as the bandwidth. Doing so, the optimizer stops at a point where the first-order optimality value is large (>100). This often indicates a point of non-differentiability.

So when setting the bandwidth to the std, can this theoretically lead to such problems?
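One way to probe this question, sketched under assumptions: when the bandwidth is h = std(xq, 1), the quantile depends on xq through h as well, and implicit differentiation of F2 = p gives d(x_active)/d(xq_i) = w_i + (sum_j w_j z_j) * d(h)/d(xq_i), with d(h)/d(xq_i) = (xq_i - mean(xq)) / (n h). The standard deviation is differentiable wherever it is positive, so this extra term does not by itself introduce a kink. As before, fzero stands in for the icdf option of ksdensity (assumed to invert the same Gaussian-kernel CDF), and all helper names are hypothetical.

```matlab
% Quantile sensitivity when the bandwidth is the standard deviation of xq
rng(0);
xq = [randn(30,1); 5 + randn(30,1)];    % mock evaluation points
p  = 0.95;                              % target probability level
n  = numel(xq);

phi = @(z) exp(-z.^2/2) / sqrt(2*pi);   % Gaussian kernel
G   = @(z) 0.5 * (1 + erf(z/sqrt(2)));  % CDF of the Gaussian kernel
bw  = @(s) std(s, 1);                   % data-dependent bandwidth
F2  = @(x, s) mean(G((x - s)/bw(s)));   % smoothed CDF with bandwidth bw(s)
quantile_of = @(s) fzero(@(x) F2(x, s) - p, ...
                         [min(s) - 10*bw(s), max(s) + 10*bw(s)]);

h = bw(xq);
x_active = quantile_of(xq);
z = (x_active - xq) / h;
w = phi(z) / sum(phi(z));               % fixed-bandwidth part of the gradient
dh_dxq = (xq - mean(xq)) / (n*h);       % derivative of std(xq, 1) w.r.t. xq_i
g = w + sum(w .* z) * dh_dxq;           % full gradient, including the bandwidth term

% Central finite-difference check of the analytic gradient
step = 1e-6;
g_fd = zeros(size(xq));
for i = 1:n
    e = zeros(size(xq));
    e(i) = step;
    g_fd(i) = (quantile_of(xq + e) - quantile_of(xq - e)) / (2*step);
end
max_err = max(abs(g - g_fd));
```

If a user-supplied gradient keeps only w and drops the bandwidth term, it is inconsistent with the objective. Whether that is what happens in the actual optimization cannot be determined from this thread.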
