nonlinear spline fit with unknown upper bound of interpolation domain
8 次查看(过去 30 天)
显示 更早的评论
SA-W
2025-2-19
I want to fit the interpolation values of a spline, but I can not provide a good guess for the upper bound of the interpolation domain. That is, x(end) is so to speek an unknown too.
The value of the objective function is a pde residual and to compute it, the spline needs to be evaluated at certain points which can change during iterations. In the toy code below, I mock this by shrinking the interpolation domain over the iterations. For example, when fmincon does the last iteration, the spline was only evaluated at points xq in [0, 6] and the initial guess for the interpolation domain (x = linspace(3, 10, 5)) is not good anymore because the interpolation values defined at x > 6 have not been activated in the forward solve. As a consequence, no meaningful parameters are assigned at points x > 6.
So far, I did a multi-stage optimization: Optimizing with fixed points x, checking the final evaluation points, refining the points, optimizing again, etc. But this is very costly.
My ideas are the following:
- Changing the interpolation points in some iterations (if required): At iteration k, the sampled points xq are in [0, 5]. So the initial points (x = linspace(3, 10, 5)) are changed to x = linspace(3, 5, 5)) in the next iteration. With this strategy I am keeping the number of parameters constant (there is probably no chance to dynamically change the number of parameters for any solver?) I am not sure if this violates differentiability assumptions of the solvers.
- Since I really know x(end) roughly only, I may treat it as an unknown too. So the new parameter vector represents [y; x(end)]. However, in this strategy I want to guide the optimizer (via a penalty or so) in a way that it moves the current x(end) to the current active region. What I can do for instance is to compute the min/max of the evaluation points in every iteration. But I am not sure how to translate this into a penalty term since the min/max evaluation points are not constants.
Can you think of smarter ways to handle this? Maybe spline fit with unknown domain is a known problem.
% Define initial parameters
x = linspace(3, 10, 5); % Interpolation points
y0 = rand(size(x)); % Initial interpolation values
% Define optimization problem
options = optimoptions('fmincon', 'OutputFcn', @output_function); % Track iteration count
% Call fmincon optimizer
global iter_count;
iter_count = 0; % Iteration counter
[y_opt, fval] = fmincon(@objective_func, y0, [], [], [], [], [], [], [], options);
% Display results
fprintf('Optimized objective function value: %.4f\n', fval);
fprintf('Optimized spline values: [%s]\n', num2str(y_opt, '%.2f '));
function obj_val = objective_func(y)
global iter_count;
% Re-compute interpolation points
x = linspace(3, 10, numel(y));
% Create spline f
f = spapi(4, x, y);
% Mock shrinking of evaluation domain
shrink_factor = 1 - exp(-0.1 * iter_count); % Exponential decay
shrinked_domain = 5 + (10 - 5) * (1 - shrink_factor); % Starts at 10, slowly shrinks to 5
xq = linspace(3, shrinked_domain, 10); % PDE evaluation points
% Evaluate the spline at xq
spline_values = fnval(f, xq);
% Compute mock PDE residual (sum of squared differences)
obj_val = sum((spline_values - mean(spline_values)).^2);
% Debug print
% fprintf('Iter %d - Eval points: [%s]\n', iter_count, num2str(xq, '%.2f '));
end
function stop = output_function(~, ~, state)
global iter_count;
if strcmp(state, 'iter')
iter_count = iter_count + 1; % Update iteration count
end
stop = false; % Continue optimization
end
6 个评论
Catalytic
2025-2-19
编辑:Catalytic
2025-2-19
It is doubtful to me that your example is sufficient to convey the essentials of your problem. As far as I can tell, both x and xq are evolving in the course of the iterations. But if that is true, it is required that they both be in some way a function of the unknowns y. Your example doesn't show us this dependence.
If you are contemplating having x(end), xq, and y all be treated as independent unknowns, then that would not make sense, since the solutions become horribly ambiguous. For example, consider -
y=linspace(0,1);
x=linspace(0,1); %x(end)=1
xq=x;
spline_values=interp1(x,y,xq,'spline');
I can get the same spline_values if I were to choose instead -
y=linspace(0,1);
x=linspace(0,100); %x(end)=100
xq=x;
There may be a way to regularize this to get rid of the ambiguity, and maybe that's what you are seeking. But if so, an understanding of how to regularize will not come from showing us your objective function and optimization set-up. It comes from looking at the underlying modeling and physics.
SA-W
2025-2-19
编辑:SA-W
2025-2-19
As far as I can tell, both x and xq are evolving in the course of the iterations.
The interpolation points (x) are not evolving in my example.
it is required that they both be in some way a function of the unknowns y. Your example doesn't show us this dependence.
xq changes indeed across iterations, but there is no chance for me to come up with a dependency xq = xq(y) -- the pde solver controls at which points it evaluates the spline. The "unknown" evaluation points xq is what I wanted to mock with the shrinked domain above. But xq are definitely not unknows parameters that I want to have optimized by fmincon.
Torsten
2025-2-19
Can't you transform the PDE to a fixed domain by rewriting it in the dimensionless spatial coordinate x' = x/L(t) where L(t) is the time-dependent length ? This would keep x' constant as [0,1].
SA-W
2025-2-19
@Torsten The domain on which I solve the pde (\Omega) is independent of the spline domain (x). Conceptually, the pde implicitly solves for data u,
r(u; f(y; xq)) = 0 --> u
where f denotes the spline, y the spline interpolation values and xq are the points at which f was being evaluated at r(u) = 0. But I doubt I can come up with a relation xq = xq(f) (if your comment pointed in that direction...)
Matt J
2025-2-19
the spline was only evaluated at points xq in [0, 6] and the initial guess for the interpolation domain (x = linspace(3, 10, 5)) is not good anymore..because the interpolation values defined at x > 6 have not been activated in the forward solve
What does "good" mean in this context? Why do you care what the values for x>6 are if they don't impact your objective function?
SA-W
2025-2-19
What does "good" mean in this context?
That the initial guess of the interpolation domain is not appropriate for the final model.
Why do you care what the values for x>6 are if they don't impact your objective function?
You are right that the objective function is not impacted much by what is happening after x > 6. But the optimizer sets the parameters at x > 6 to values near the upper bound I am passing to the optimizer. The upper bound is way larger than the fitted parameters at x < 6, causing an extremly large curvature at x > 6 which makes the spline unemanable for evaluation beyond x > 6. When I use the trained model after optimizing, I may have to evaluate at x > 6 where the spline interpolates non-sense values.
Ideally, I would crop the spline after x > 6 by discarding the parameters defined at those points, but the thus obtained spline is of course different from the original one.
采纳的回答
Matt J
2025-2-19
编辑:Matt J
2025-2-19
The upper bound is way larger than the fitted parameters at x < 6, causing an extremly large curvature at x > 6 which makes the spline unemanable for evaluation beyond x > 6.
If high curvature is the problem, perhaps put a penalty on curvature,
% Evaluate the spline at xq
spline_values = fnval(f, xq);
curv_values = fnval(fnder(f,2), x );
%Penalized objective
obj_val = norm( spline_values - mean(spline_values) ).^2 + weight*norm(curv_values).^2;
However, I find it strange that you cannot dictate xq (or at least min(xq), max(xq) ) to your PDE solver. I also find it strange that your PDE residual term has no dependence on any measured data.
27 个评论
SA-W
2025-2-19
编辑:SA-W
2025-2-19
I also find it strange that your PDE residual term has no dependence on any measured data.
As said in the comments,
r(u; f(y; xq)) = 0 --> u
the pde solver solves iteratively for "u" using Newton's method and the notation "r(u; f(y; xq))" means that "r" is a function of the spline f which is evaluated at points xq which in turn depend on the unknown "u" and the parameters "y". So of course, the residual is something like ||u - u_data||^2. I did not know how to mock the pde solver in the example, so I only wrote "spline_values = fnval(f, xq);" and neglected r(...).
However, I find it strange that you cannot dictate xq (or at least min(xq, max(xq)) to your PDE solver.
What do you mean by "min(xq, max(xq))"? is it not the same as xq?
Matt J
2025-2-19
编辑:Matt J
2025-2-19
What do you mean by "min(xq, max(xq))"?
Should be min(xq), max(xq). In other words, I find it strange that you cannot control the domain on which the PDE is solved. All of Matlab's ODE solvers, for example have a tspan argument letting you define this.
Note BTW that whatever black-box method the PDE solver uses to choose xq, you need it to at least be a continuous and piecewise differentiable function of y. If xq is chosen randomly, for example, fmincon will not work.
SA-W
2025-2-19
Should be min(xq), max(xq). In other words, I find it strange that you cannot control the domain on which the PDE is solved.
The domain on which I solve the pde (e.g a cube in 3d to solve a heat equation) has nothing to do with the spline domain x. Think of it like this: For the heat equation, the residual depends on a coefficient \kappa and instead of having a constant \kappa, I replace it with a spline function. But the domain of \kappa (1d) and the pde domain (cube, 3d) are two different things. Though, there are probably too many details hidden about the pde solver.
What I mentioned in the body of the question, namely to define [y; x(end)] as the new parameter set, you think its not pertinent? The idea here would be to prescribe the total number of points and the spacing between the points, but leaving x(end) as a free parameter.
Matt J
2025-2-19
编辑:Matt J
2025-2-19
What I mentioned in the body of the question, namely to define [y; x(end)] as the new parameter set, you think its not pertinent?
I'm not sure that's going to help. Based on your earlier comments, the problem sounds more as though you need to force the optimizer to explore a wider range of xq, because later, once y is determined, you plan to use larger xq than what the optimizer is exploring now. But there is no influence that x(end) has on xq as far as the you've shown us.
The bottom line is, you need to design the cost function so that when fmincon explores y values that you consider bad, the objective value grows larger. A penalty on y, along the lines of what I proposed, might do that. Similar to what @Catalytic commented, though, I don't understand the underlying modeling and physics well enough to recommend anything precise.
SA-W
2025-2-19
I'm not sure that's going to help.
I might have to explore myself whether this makes sense or not.
For the gradient, I need the derivative df/dx(end). For df/dy, we could use
f = spapi(4,x,y);
dfdy = spapi(4, x, eye(numel(x))
because of linearity. I think f is also linear w.r.t x(end), but I am not sure how to compute df/dx(end) in this case.
Matt J
2025-2-19
You also have to somehow compute dxq/dy since there is some black-box dependence of xq on y. I think finite differencing will be the only way.
SA-W
2025-2-19
编辑:SA-W
2025-2-19
You also have to somehow compute dxq/dy since there is some black-box dependence of xq on y.
I think I do not need this. So far, I minimized ||u(y) - u_data||^2 where u is the pde solver outcome and I have an expression for du/dy which depends on df/dy. My gradients are correct, no need for dxq/dy.
Now I would minimize ||u(y;x(end)) - u_data||^2 and have to provide [du/dy; du/dx(end)] where du/dx(end) depends on df/dx(end).
You are not aware of an analytical way for df/dx(end) only?
Matt J
2025-2-19
No, but it's a fairly easy finite difference calculation, e.g.,
x=[0,sort(rand(1,5))];
xq=rand*max(x);
y=rand(1,numel(x));
f=spapi(4,x,y);
delta=1e-12;
f0=f;
xdelta=x; xdelta(end)=xdelta(end)+delta;
fdelta=spapi(4,xdelta,y);
%Finite idfference calculation
findiff =(fnval(fdelta,xq)-fnval(f0,xq))./delta
findiff = -28.0811
SA-W
2025-2-19
Got it. If I wanted to add a penalty term, e.g.,
x_active = max(xq);
obj_val = data_error + weight * ||x(end) - x_active||^2
then I can guide the optimizer to move x(end) towards the current x_active. But to do so, I would have to provide a derivative dxq / dy, right?
SA-W
2025-2-20
Pondering about it longer, I might be able to compute dxq / dy.
Based on my experience, quantile(xq, 0.95) would be a good value for x(end). Designing the loss
x_active = quantile(xq, 0.95);
obj_val = data_error + weight * ||x(end) - x_active||^2
the derivative "d(quantile(xq, 0.95))/dxq" is needed which is piecewise constant only.
Do you have a suggestion how to create a smoothed quantile approximation to make the loss term continuously differentiable?
SA-W
2025-2-20
编辑:SA-W
2025-2-20
Exactly what I was looking for. I can then compute dx_active / dx as follows:
[x_active, ~] = ksdensity(xq, 0.95, 'Function', 'icdf')
[f, ~] = ksdensity(xq, x_active, 'Function', 'pdf')
dx_active_dx = 1 / f
Once I have "dx_active_dx", I need to multiply it by "dx/dy" to differentiate the loss term.
The next problem I see is that "x", i.e the points at which the spline is evaluated, are not an explicit function of the parameters "y" but of the pde solution "u". Sloppy speaking, x = tr(A(u)) where A(u) is a matrix obtained from "u". This means that I can compute "x" for a given "u" but not without. So when computing the points "x", I am at the same time computing "dx/dy".
I guess we could apply the KDE smoothing to "dx/dy" and evaluate it at x_active. But given that "dx/dy" is not injective (in theory, there may be multiple x producing the same dx/dy), any sort of interpolation should be avoided. Ideally, we can compute "dx_active_dy" in one go without chain rule, thereby only using the already available "dx/dy" at the evaluation points xq. What do I mean by "in one go"?
A kernel density estimate approximates the density by something like

and the derivative involves only known quantities "x" and "dx/dy".
Does that make sense and can it be implemented easily with builtin functions?
Matt J
2025-2-20
I'm afraid I'm not following much of it clearly anymore, firstly because you are using "x" and "xq" interchangeably in your terminology and secondly, because the real mathematical problem, which involves u, is very different from your originally posted code.
However, if you are asking about differentiating p(x), it is again a 1D function of x, so it should again be easy to do with ksdensity() and finite differences, if all else fails.
SA-W
2025-2-20
Makes sense what you say.
I have a more concrete question based on the ksdensity suggestion (let me know if you want me to open a new question for this):
The strategy I want to puruse requires me to write a custom version for pdf/cdf to compute the derivatives analytically. Below, I managed to implement my own version version for the pdf, matching perfectly what ksdensity gives. However, for the cdf, I managed to write only a discrete version "cumsum(f) / sum(f)" and there is also a mismatch with the ksdensity version.
How can I get the analytical version for the cfd? Probably it is as simple as for the pdf.
rng('default') % For reproducibility
xq = [randn(30,1); 5+randn(30,1)];
h = 1.6979;
[f, x_eval] = ksdensity(xq, 'Function', 'pdf', 'Bandwidth', h);
% Custom code for pde
f2 = zeros(size(x_eval));
for i = 1:length(x_eval)
f2(i) = sum(exp(-((xq - x_eval(i)).^2) / (2*h^2)) / (sqrt(2*pi) * h));
end
f2 = f2 / length(xq); % Normalize
% Compare (perfect match)
plot(x_eval, f, 'r', 'LineWidth', 2);
hold on;
plot(x_eval, f2, 'b--', 'LineWidth', 2);
legend('ksdensity for pdf', 'Custom code for pdf');

% CDF comparison (no good match)
[F, x_eval] = ksdensity(xq, 'Function', 'cdf', 'Bandwidth', h);
F2 = cumsum(f) / sum(f);
figure(2);
plot(x_eval, F, 'r', 'LineWidth', 2);
hold on;
plot(x_eval, F2, 'b--', 'LineWidth', 2);
legend('ksdensity for cdf', 'Custom code for cdf');

Matt J
2025-2-20
For a perfect analytical expression, you will have to integrate your Gaussian kernels. You can probably do that accurately with normcdf.
SA-W
2025-2-20
For a perfect analytical expression, you will have to integrate your Gaussian kernels.
Yes, the erf() functon gives perfect match (see below).
With these approximations for the PDF and CDF, I think I just have to put some pieces together. Next, I simply compute
[x_active, ~] = ksdensity(xq, 0.95, 'Function', 'icdf')
Using my terminology in the code below, how can we compute "d(x_active) / dy" assuming that "d(xq) / dy" is known? Based on the implicit function therom, I guess we can start by "d(x_active) / dy = 1 / f2(x_active) * dF2 / dy". But I am not sure how to differentiate through "dF2 / dy"
% PDF at x_active
f2(x_active) = sum(exp(-((xq - x_active).^2) / (2*h^2)) / (sqrt(2*pi) * h));
% CDF at x_active
F2(x_active) = sum(G((x_active - xq) / h)) / length(xq);
rng('default') % For reproducibility
xq = [randn(30,1); 5+randn(30,1)];
h = 1.6979;
[f, x_eval] = ksdensity(xq, 'Function', 'pdf', 'Bandwidth', h);
% Custom code for pde
f2 = zeros(size(x_eval));
for i = 1:length(x_eval)
f2(i) = sum(exp(-((xq - x_eval(i)).^2) / (2*h^2)) / (sqrt(2*pi) * h));
end
f2 = f2 / length(xq); % Normalize
% Compare (perfect match)
plot(x_eval, f, 'r', 'LineWidth', 2);
hold on;
plot(x_eval, f2, 'b--', 'LineWidth', 2);
legend('ksdensity for pdf', 'Custom code for pdf');

% CDF matlab
[F, x_eval] = ksdensity(xq, 'Function', 'cdf', 'Bandwidth', h);
% CDF sustom
G = @(x) 0.5 * (1 + erf(x / sqrt(2))); % CDF of Gaussian kernel
F2 = zeros(size(x_eval));
for i = 1:length(x_eval)
F2(i) = sum(G((x_eval(i) - xq) / h)) / length(xq);
end
% Compare (perfect match)
figure(2);
plot(x_eval, F, 'r', 'LineWidth', 2);
hold on;
plot(x_eval, F2, 'b--', 'LineWidth', 2);
legend('ksdensity for cdf', 'Custom code for cdf');

SA-W
2025-2-21
编辑:SA-W
2025-2-21
No,
[~,x_active] = ksdensity(xq, 0.95, 'Function', 'icdf')
means that "x_active = 0.95". But "x_active" should be the approx. 95% quantile from the distribution. So
[x_active, ~] = ksdensity(xq, 0.95, 'Function', 'icdf') % x_active = F^-1(0.95)
% PDF at x_active
f(x_active) = sum(exp(-((xq - x_active).^2) / (2*h^2)) / (sqrt(2*pi) * h * length(xq)));
% CDF at x_active
G = @(x) 0.5 * (1 + erf(x / sqrt(2))); % CDF of Gaussian kernel
F(x_active) = sum(G((x_active - xq) / h)) / length(xq);
is the correct syntax.
Please correct me if I am wrong, but I think the inverse function theorem allows us to do
d(x_active) / dy = 1 / f(x_active) * dF/dy
For "dF/dy", I am not sure. What I know are the derivatives dxq/dy = [dxq(1)/dy, ..., dxq(m)/dy]
Matt J
2025-2-21
编辑:Matt J
2025-2-21
We have,
F(xact,xq) = 0.95
When I differentiate with respect to y using the chain rule, I obtain,
dF(xact,xq)/dxact *dxact/dy + sum_i dF(xact,xq)/dxq(i) *dxq(i)/dy = 0
Rearranging, this gives
dxact/dy = -sum_i dF(xact,xq)/dxq(i) *dxq(i)/dy / f(xact)
So, if you know dxq/dy as you claim, then I think you can get everything on the right hand side.
SA-W
2025-2-21
Nice. So although the points xq (hence also the densitites PDF and CFD) change across iterations, the loss term ||xact - x(end)||^2 fulfills the requirements of differentiability as required for fmincon, right?
Matt J
2025-2-21
编辑:Matt J
2025-2-21
Well, I don't feel entirely qualified to say, because I know nothing about xq, dxq/dy and where they come from. If you are sure that dx/dy is smooth, then yes. If it is at least piecewise smooth, then it is probably okay as well, although why fmincon tends to work on functions that are only piecewise smooth isn't so well understood theoretically.
SA-W
2025-2-21
Interesting. I am not really sure about the smoothness of dx/dy since I only have it available at discrete points xq: dxq(1)/dy, ..., dxq(m)/dy. Thats also why i wanted to express
dxact/dy = -sum_i dF(xact,xq)/dxq(i) *dxq(i)/dy / f(xact)
solely in terms of dxq(i)/dy.
So given a vector of dxq(i)/dy, are there built-in functions to assess the smoothness of the data?
Matt J
2025-2-21
编辑:Matt J
2025-2-24
If there is anything to do an analytical continuity assessment, it would be in the Symbolic Toolbox, though I am not familiar enough with it to know if you'd find something appropriate there.
You could measure compute derivatives and make some kind of assessment of how rapidly they change...
SA-W
2025-4-6
I posted a follow-up question on the ksdensity calculation:
I appreciated if you could comment on that. Your ideas are always valuable!
SA-W
2025-4-15
编辑:SA-W
2025-4-15
In case you are still following, maybe you can share your thoughts on the following:
bw = std(xq, 1)
[x_active, ~] = ksdensity(xq, 0.95, 'Function', 'icdf', 'Bandwidth', bw)
loss = data_loss + weight * (x_active - y)^2 % y: optimization variable
As we discussed in this chat, I am adding a loss term based on the smoothed inverse CDF obtained from ksdensity, where y is an optimization variable (the last interpolation point of the spline). Then, the constraint f''>=0 needs to be re-imposed as non-linear inequality constraints as the interp domain changes...
I observed that when passing a constant bandwidth to ksdensity, the optimization works very well and robust. But when the sampled values xq are densed, the bandwidth will become small and I used the standard deviation for the bandwidth. Doing so, the optimizer stops at a point where the first-order-optimality value is large (>100). This often indicates a point of non-differentiability.
So when setting the bandwidth to the std, can this theoretically lead to such problems?
更多回答(0 个)
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Geometry and Mesh 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!发生错误
由于页面发生更改,无法完成操作。请重新加载页面以查看其更新后的状态。
您也可以从以下列表中选择网站:
如何获得最佳网站性能
选择中国网站(中文或英文)以获得最佳网站性能。其他 MathWorks 国家/地区网站并未针对您所在位置的访问进行优化。
美洲
- América Latina (Español)
- Canada (English)
- United States (English)
欧洲
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom(English)
亚太
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)