Split array into chunks based on trigger values in another array

Question

Hi all

I've searched for a bit and found some related questions, but none that achieve exactly what I want (point me towards one that does, if you know of one).

What I am looking for is to split an array A into chunks that correspond to runs of consecutive ones (or a single one) in array B. Consider the following example:

A = (1:10)';
B = [0 1 1 1 0 0 1 0 1 1]';
% f = @(trig, data) ...   (my magic function)
% output of f(B,A) should be the following:
% >> f(B, A)
% ans = { [2, 3, 4]; [7]; [9, 10] }

I've come up with a working solution, but it looks like it could be done more efficiently or faster. Hit me with ideas :)

function groups = f(trig, data)
    % approach with splitapply
    len = length(trig);
    % find rising edges (diff = 1) and falling edges (diff = -1)
    % note: d(1) stays 0, so a run that starts at the first sample is not detected
    d = zeros(len,1);
    d(2:len) = diff(trig);
    % multiply with increasing numbers to generate unique keys
    g = d .* ((2:len+1).');
    % apply cumsum to assign same key to samples between triggers
    gs = cumsum(g);
    % put NaN for negative keys (after falling edges -> where trigger is 0)
    % so splitapply will ignore those samples
    gs(gs < 1) = NaN;
    % use findgroups to generate consecutive keys
    gr = findgroups(gs);
    % function that returns the array in a cell
    % (named wrap so it does not shadow the enclosing function f)
    wrap = @(a) {a};

    % let splitapply do the work
    groups = splitapply(wrap,data,gr);
end
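To see how the keys in this function come about, here is a small trace of the intermediate vectors for the example from the question. It is only an illustration using core functions; the findgroups and splitapply steps are left out, and the values in the comments were worked out by hand for this particular B.

```matlab
A = (1:10)';
B = [0 1 1 1 0 0 1 0 1 1]';
len = length(B);

d = zeros(len,1);
d(2:len) = diff(B);        % [0 1 0 0 -1 0 1 -1 1 0]'   edges
g = d .* ((2:len+1).');    % [0 3 0 0 -6 0 8 -9 10 0]'  weighted edges
gs = cumsum(g);            % [0 3 3 3 -3 -3 5 -4 6 6]'  one key per run
gs(gs < 1) = NaN;          % [NaN 3 3 3 NaN NaN 5 NaN 6 6]'

disp([A B gs]);            % data, trigger, and key side by side
```

The three distinct keys (3, 5, 6) are what findgroups turns into group numbers 1, 2, 3. If B(1) were 1, d(1) would still be 0, so the first run would keep the key 0 and be set to NaN.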

Cheers

Manuel

Edit: Changed Example for more clarity

4 Comments (2 older comments not shown)

Just Manuel 2021-2-11

Edited: Just Manuel 2021-2-16

A follow-up: I rewrote my first, straightforward approach, the one that took ages to compute:

function groups = f2(trig, data)
    % approach with for-loop
    len = length(trig);
    groups = {};
    oldTrigger = 0;
    ngrp = 0;
    for n = 1:len
        % get trigger value
        newTrigger = trig(n);
        % rising edge
        if (1 == newTrigger) && (0 == oldTrigger)
            % increment group index
            ngrp = ngrp + 1;
            % initialize new group
            curGroup = [];
            % reset sample index
            i = 1;
        end
        % trigger high
        if (1 == newTrigger)
            % add sample to group
            curGroup(i,1) = data(n);                    %#ok<AGROW>
            % increment sample index
            i = i + 1;
        end
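        % falling edge, or last sample while the trigger is still high
        % (the extra check avoids indexing groups{0,1} when no trigger occurs)
        if ((0 == newTrigger) && (1 == oldTrigger)) || ((n == len) && (1 == newTrigger))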
            % add group to output
            groups{ngrp,1} = curGroup;                  %#ok<AGROW>
        end
        % remember trigger value
        oldTrigger = newTrigger;
    end
end

I suspected the two array-growing statements were the culprit for the bad performance, so I ran a short benchmark:

A = rand(1e6,1);
B = rand(1e6,1) > 0.5;
t0 = tic;
g = f(B,A);
t1 = toc(t0);
disp(numel(g));
fprintf('f took %gs\n', t1);
t0 = tic;
g = f2(B,A);
t1 = toc(t0);
disp(numel(g));
fprintf('f2 took %gs\n', t1);

Which yielded the following output:

249706
f took 5.17879s
249707
f2 took 1.1579s

That surprised me. I must have done something differently in my original approach... Also, thanks to the simple comparison of the number of output elements, I found that f does not include the first group if the trigger starts high (I had not noticed that because, in my application, the trigger data always start with 0).

Well, now I'm even more curious whether you have ideas for improvement!
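One way to avoid losing a group when the trigger starts (or ends) high is to pad the trigger with a zero on both sides before taking diff, so every run has both a rising and a falling edge. The following is only a minimal sketch under that idea, not one of the benchmarked functions; the trigger B below is a hypothetical example that starts high, and the names starts and stops are chosen here for illustration.

```matlab
A = (1:10)';
B = [1 1 0 0 1 0 1 1 1 0]';    % hypothetical trigger that starts high

% pad with zeros so a run touching either end still produces two edges
d = diff([0; double(B(:)); 0]);
starts = find(d == 1);          % first index of each run
stops  = find(d == -1) - 1;     % last index of each run

% cut the data into one cell per run
groups = arrayfun(@(s, e) A(s:e), starts, stops, 'UniformOutput', false);
celldisp(groups);
```

For this B the runs are samples 1 to 2, sample 5, and samples 7 to 9. How this compares in speed to f and f2 has not been measured here.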

Cheers

Manuel

Just Manuel 2021-2-16

Hi Mathieu

Thank you for your input!

It seems you interpreted my question as asking the same as the two questions I linked. Indeed, using A(B>0.5) would be the simple answer if I wanted the data values in a single vector (ans = 2 3 4 7 9 10 for the example in the question).

However, I want them in separate arrays based on runs of consecutive ones (or, as you pointed out, also a single 1) in the trigger values.
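The difference between the two results can be shown on the example from the question. The sketch below first uses plain logical indexing, then labels each run with a running count of rising edges and lets accumarray collect the samples per label. This is an illustration with assumed variable names (mask, labels, chunks), not the code discussed in this thread.

```matlab
A = (1:10)';
B = [0 1 1 1 0 0 1 0 1 1]';
mask = B > 0.5;

% logical indexing: all selected samples end up in one vector
flat = A(mask);                      % [2 3 4 7 9 10]'

% run labels: count rising edges, then blank out samples where B is 0
rise = diff([0; B]) == 1;
labels = cumsum(rise) .* B;          % [0 1 1 1 0 0 2 0 3 3]'

% one cell per label; labels(mask) is sorted, so sample order is kept
chunks = accumarray(labels(mask), A(mask), [], @(x) {x});
celldisp(chunks);
```

Here chunks is a 3-by-1 cell array holding [2;3;4], 7 and [9;10], which is the grouping asked for in the question.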

Thank you for your code. I had used "find" before, but had not thought of using it to determine the edge indices. I adapted your code so it does what I want:

function groups = f3(trig, data)
    % approach adapted from Mathieu NOE's code
    S = trig - 0.5;
    % first look for exact zeros
    ind0 = find( S == 0 ); 
    % then look for zero crossings between data points
    S1 = S(1:end-1) .* S(2:end);
    ind1 = find( S1 < 0 );
    % bring exact zeros and "in-between" zeros together 
    ind = sort([ind0 ind1]);
    for ii=1:length(ind)
        DEN = (S(ind(ii)+1) - S(ind(ii)));
        slope_sign(ii) = sign(DEN);
    end
    % extract the positive slope crossing points
    ind_pos = ind1(slope_sign>0);
    % extract the negative slope crossing points 
    ind_neg = ind1(slope_sign<0);
    groups = {};
    gr_ind = 1;
    neg_offset = 0;
    % if trigger signal starts high
    % (note: ind_neg(1) and ind_pos(1) require at least one falling and one rising edge)
    if ind_neg(1) < ind_pos(1)
        groups{1} = data(1:ind_neg(1));
        gr_ind = 2;
        neg_offset = 1;
    end
    % build groups body (excluding last trigger index)
    for i = 1:length(ind_pos)-1
        groups{gr_ind,1} = data(ind_pos(i)+1:ind_neg(i+neg_offset));
        gr_ind = gr_ind+1;
    end
    % if trigger signal ends high
    if ind_neg(end) < ind_pos(end)
        groups{end+1} = data(ind_pos(end)+1:end);
    else
        groups{end+1} = data(ind_pos(end)+1:ind_neg(end));
    end
end

And see, it actually performs better than my solution with the for-loop. So, even if you seem to have misunderstood my intention, you have contributed a solution with better performance :D Thanks!
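The edge detection at the start of f3 can be traced on the example trigger from the question. The sketch below assumes the trigger only contains 0 and 1, so there are no exact zeros in S (ind0 is empty) and ind equals ind1; under that assumption the slope sign can also be computed in one vectorized line instead of the loop. The index values in the comments were worked out by hand for this B.

```matlab
B = [0 1 1 1 0 0 1 0 1 1]';
S = B - 0.5;                                 % -0.5 where low, +0.5 where high

% a sign change between neighbours marks an edge after that index
S1 = S(1:end-1) .* S(2:end);
ind1 = find(S1 < 0);                         % [1 4 6 7 8]'

% vectorized slope sign (assumes ind == ind1, i.e. no exact zeros in S)
slope_sign = sign(S(ind1 + 1) - S(ind1));    % [1 -1 1 -1 1]'

ind_pos = ind1(slope_sign > 0);              % [1 6 8]'  last low sample before a run
ind_neg = ind1(slope_sign < 0);              % [4 7]'    last high sample of a run
disp(ind_pos'); disp(ind_neg');
```

With these indices, the groups are the samples ind_pos(k)+1 up to the matching ind_neg entry, which gives 2 to 4, 7, and 9 to the end for this example.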

benchmark output:

250642
f1 took 4.63893s
250643
f2 took 0.960263s
250643
f3 took 0.510538s

Mathieu NOE 2021-2-23

Glad it helped!

Answers (0)
