Vectorizing the Spline Function

Hayley Rogers 2021-2-3
Answered: Vinayak 2024-5-14
I'm trying to make my code more efficient by vectorizing it, since I'm running it on multiple variables on a 2M+ dataset. For the life of me, I can't figure out how to vectorize a MATLAB spline function. The inputs are vectors, and the output is a structure. I can't figure out how to make it work...
%This function is intended to take a variable from the dataset I'm working with
function [output] = Splines(Var,GroupLineNum)
output = array2table(zeros(height(Var),6));
output.Properties.VariableNames = {'spx' 'pcx' 'mkx' 'spxx' 'pcxx' 'mkxx'};
x = array2table(zeros(height(Var),3));
x.Properties.VariableNames = {'x1' 'x2' 'x3'};
y = array2table(zeros(height(Var),3));
y.Properties.VariableNames = {'y1' 'y2' 'y3'};
dummy_table = table(0);
dummy_table2 = [dummy_table;dummy_table];
Variable = array2table(Var);
a_sym=3:height(Var);
a_var=1:height(Var)-2;
b_sym=2:height(Var);
b_var=1:height(Var)-1;
c=1:height(Var);
if GroupLineNum{c,:}<3
output{c,'spx'} = missing;
output{c,'pcx'} = missing;
output{c,'mkx'} = missing;
output{c,'spxx'} = missing;
output{c,'pcxx'} = missing;
output{c,'mkxx'} = missing;
end
%Spline will have three points, x just set as 1/2/3
x(:,'x1') = {1};
x(:,'x2') = {2};
x(:,'x3') = {3};
%The Variables are from a time series, so I'm trying to grab the t-2, t-1, t observations to go in the y vector. It needed to be the same size as the total variable dataset, so I added dummy rows to the beginning.
y(:,'y1') = [dummy_table2;array2table( Variable{a_var,'Var'}.*(GroupLineNum{a_sym,:}>=3)+0.*(GroupLineNum{a_sym,:}<3))];
y(:,'y2') = [dummy_table;array2table(Variable{b_var,'Var'}.*(GroupLineNum{b_sym,:}>=3)+0.*(GroupLineNum{b_sym,:}<3))];
y(:,'y3') = array2table( Variable{c,'Var'}.*(GroupLineNum{c,:}>=3)+0.*(GroupLineNum{c,:}<3));
sp(z,:) = struct2table(spline(z,y{c,:})); % This is where I'm stuck! I can't get this to vectorize. The rest of the code is similar, and basically take the first/second derivative of the spline/pchip/makima fitted lines, then puts it all back together. If I can figure this bit out, hopefully I can finish vectorizing the rest of the code.
pc = pchip(x,y);
mk = makima(x,y);
spx = fnder(sp,1);
pcx = fnder(pc,1);
mkx = fnder(mk,1);
spxx = fnder(sp,2);
pcxx = fnder(pc,2);
mkxx = fnder(mk,2);
output{c,'spx'} = ppval(spx,3);
output{c,'pcx'} = ppval(pcx,3);
output{c,'mkx'} = ppval(mkx,3);
output{c,'spxx'} = ppval(spxx,3);
output{c,'pcxx'} = ppval(pcxx,3);
output{c,'mkxx'} = ppval(mkxx,3);
end

Answers (1)

Vinayak 2024-5-14
Hi Hayley,
Vectorizing spline functions, especially under various conditions, might not always be straightforward. It might be worth exploring alternative approaches for optimization or vectorization in your case.
For instance, if we maintain the loop for c = 1:height(Var), we can optimize by setting xValues statically as [1,2,3] since they don't change. This not only saves memory but also simplifies the calculation of yValues for cases where GroupLineNum{c} >= 3.
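The three-point window itself can also be assembled for all rows at once instead of slicing inside the loop. A minimal sketch, assuming Var is a numeric column vector (the use of array2table(Var) in the question suggests this); buildLagMatrix is a hypothetical helper name, and the first two rows are padded with NaN rather than with the zero dummy rows from the question:

```matlab
function Y = buildLagMatrix(Var)
% Hypothetical helper: row c of Y holds [Var(c-2), Var(c-1), Var(c)].
% Assumes Var is a numeric vector; rows without two earlier
% observations are padded with NaN.
Var = Var(:);             % force column orientation
n = numel(Var);
Y = NaN(n, 3);
Y(3:n, 1) = Var(1:n-2);   % observation at t-2
Y(2:n, 2) = Var(1:n-1);   % observation at t-1
Y(:, 3)   = Var;          % observation at t
end
```

Each row of Y then plays the role of yValues for one value of c, so the per-row indexing cost is paid once for the whole dataset.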
When it comes to calculating splines, computing them only for the rows that need them and assigning missing values to the rest can streamline the process. Additionally, I noticed the use of dummy tables (dummy_table, dummy_table2) for padding; consider adding them back only if they serve a critical purpose outside the demonstrated scope.
output = array2table(zeros(height(Var), 6), 'VariableNames', {'spx', 'pcx', 'mkx', 'spxx', 'pcxx', 'mkxx'});
% Constants for x values
xValues = [1, 2, 3];
% Loop through each set of points
for c = 1:height(Var)
    % GroupLineNum is a table, so it needs a row and a variable subscript
    if GroupLineNum{c, 1} < 3
        % Assign missing values if condition is met
        output{c, :} = missing;
    else
        % Extract y values (t-2, t-1, t) for the current row. No extra mask is
        % needed: this branch already guarantees GroupLineNum{c, 1} >= 3, and
        % the question's code masks on the current row only.
        yValues = Var(max(1, c-2):c);
        % Skip rows where the previous two data points do not exist
        if numel(yValues) < 3
            continue;
        end
        % Spline and its derivatives (fnder requires Curve Fitting Toolbox)
        sp = spline(xValues, yValues);
        pc = pchip(xValues, yValues);
        mk = makima(xValues, yValues);
        spx = fnder(sp, 1);
        pcx = fnder(pc, 1);
        mkx = fnder(mk, 1);
        spxx = fnder(sp, 2);
        pcxx = fnder(pc, 2);
        mkxx = fnder(mk, 2);
        % Evaluate derivatives at x = 3
        output{c, 'spx'} = ppval(spx, 3);
        output{c, 'pcx'} = ppval(pcx, 3);
        output{c, 'mkx'} = ppval(mkx, 3);
        output{c, 'spxx'} = ppval(spxx, 3);
        output{c, 'pcxx'} = ppval(pcxx, 3);
        output{c, 'mkxx'} = ppval(mkxx, 3);
    end
end
This approach should offer a more streamlined and efficient way to handle your data, especially considering the volume you mentioned (2M+ data points). If performance is still a concern, leveraging parfor for parallel execution might further enhance the process.
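If the loop is still too slow, one option worth trying is to let spline, pchip, and makima fit all rows in a single call. These functions accept a matrix of y values whose last dimension matches length(x) and return one vector-valued piecewise polynomial. The sketch below is an illustration under assumptions, not a drop-in replacement: splineEndDerivs is a hypothetical helper name, Y is assumed to be an n-by-3 numeric matrix whose row c holds the t-2, t-1, and t observations, and valid is a logical vector such as GroupLineNum{:, 1} >= 3. It also assumes fnder handles the vector-valued form, which I have not benchmarked at 2M rows.

```matlab
function D = splineEndDerivs(Y, valid)
% Hypothetical helper: first and second derivatives at x = 3 of the
% spline, pchip, and makima interpolants through
% (1, Y(c,1)), (2, Y(c,2)), (3, Y(c,3)) for every row c flagged in valid.
% Columns of D: spx pcx mkx spxx pcxx mkxx. Other rows stay NaN.
% Requires fnder (Curve Fitting Toolbox), as in the loop version.
x = [1 2 3];
n = size(Y, 1);
D = NaN(n, 6);
% Drop rows containing NaN/Inf so the fitting functions do not discard sites
valid = logical(valid(:)) & all(isfinite(Y), 2);
Yv = Y(valid, :);
if isempty(Yv)
    return;
end
% Each row of Yv is treated as one component of a vector-valued function
fits = {spline(x, Yv), pchip(x, Yv), makima(x, Yv)};
for k = 1:3
    D(valid, k)     = ppval(fnder(fits{k}, 1), 3);   % first derivative
    D(valid, k + 3) = ppval(fnder(fits{k}, 2), 3);   % second derivative
end
end
```

The result can be turned back into the original table layout with array2table(D, 'VariableNames', {'spx', 'pcx', 'mkx', 'spxx', 'pcxx', 'mkxx'}). As a sanity check for the spline columns: with only three points, the not-a-knot spline should reduce to the parabola through them, so spx should match (y1 - 4*y2 + 3*y3)/2 and spxx should match y1 - 2*y2 + y3.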
I hope this helps!
