Matching ranges of values and inserting new rows in columns

Question

Lukas Netzer 2021-8-17

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1413322-matching-ranges-of-values-and-inserting-new-rows-in-columns

编辑： Lukas Netzer 2021-8-20

采纳的回答： TADA

Following a previous question, I now need to dive a little deeper:

I have two tables (workspace is attached):

table1 contains the columns "duration", "WP_1", "WP_2":

table 2 contains the columns "WP1", "WP2", "WC":

I need to split WP_1 and WP_2 in table 1 according to WP1 and WP2 in table2 - what does that mean?

Duration and WC need to be matched, but the ranges of WP_1 & WP_2 and WP1 & WP2 sometimes don't fit.

Let me give you an example:

row3 in table1 for WP_1 and WP_2 has a range of 5 to 115 with a duration of 552.8

--> this perfectly fits into row1 of table2, where we have a range of 0 to 145 with a WC of 0

--> therefor 100% of the duration in table1 can be connected to WC 0

Problem:

row4 in table1 for WP_1 and WP_2 has a range of 115 to 219 with a duration of 584.7

--> does not fit in just one category:

row1 in table2: range for WP1 and WP2 is 0 to 145 with WC 0

row2 in table2: range for WP1 and WP2 is 145 to 169 with WC 1

row3 in table2: range for WP1 and WP2 is 169 to 1693 with WC 1

Solution:

split the WP_1 and WP_2 according to WP1 and WP2 and calculate their new duration
duration will be calculated with the according parts
WP_1 and WP_2 in table1's row4 will be split and new rows will be inserted:

--> original row4: WP_1: 115 WP_2: 219 duration: 584.7 --> 104 parts (219-115); 1 part =~ 5.62

--> new row4: WP_1: 115 WP_2:145 duration: 168.66 (30 parts (145-115) x 5.62)

--> new row5: WP_1:145 WP_2: 169 duration: 134.93 (24 parts (169-145) x 5.62)

--> new row6: WP_1:169 WP_2: 219 duration: 281.1 (50 parts (219-169) x 5.62)

the new table1 should look like that:

The main problem is the matching and splitting part - calculating the new durations should be quite easy.

Unfortunately as of writing this I have no clue how to match and split - any hints are very much appreciated!

Thanks for your help!

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

TADA 2021-8-17

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1413322-matching-ranges-of-values-and-inserting-new-rows-in-columns#answer_768922

在 MATLAB Online 中打开

Once you know that table2 is sorted in ascending order and that there are no overlaps and no shenanigans, you can map the positions of table1 entries in table2 windows.

% this will generate a rather large matrix which maps the entries in table1
% to windows in table2 which start before they start.
% the last true flag in each row will be the starting WP1 for this entry
wp1Mask = table1.WP_1 >= table2.WP1';

the same trick can be done with WP2 only using less than because we want to find the window in table2 which ends after the table1 entry ends

After you finish mapping everything up, you can start correlating.

I normally prefer using logical indices over using find, because in most cases it is not necessary.

The problem is find doesn't work on a specific dimention in a matrix like other functions do, so you can either do it with a loop, or you can use this nifty trick which was proposed by Matt J in another post:

% this will find the first index of a table2 window that starts before each
% table1 entry - not what we want actually, but could be useful for WP2
[~, iWP1] = max(wp1Mask, [], 2); 
% to achieve what we seek we can use cumsum combined with the same trick
% this will find the index of the last table2 window which starts before each table1
% entry - what we seek.
[~, iWP1] = max(cumsum(wp1Mask, 2), [], 2);

Once you know these indices, we can find which table1 entries fit only one window and which fit in several windows, and then split those

hope this is useful

7 个评论
显示 5更早的评论隐藏 5更早的评论

Lukas Netzer 2021-8-19

在 MATLAB Online 中打开

Thank you again for your answer, this is what I came up (which is not working) but maybe you can tell me if I'm even close to going in the right direction:

wp1Mask = table1.WP1 >= table2.WP1';
wp2Mask = table1.WP2 <= table2.WP2';
[~, iWP1] = max(cumsum(wp1Mask, 2), [], 2);
[~, iWP2] = max(wp2Mask, [], 2); 
for x = 1:size(iWP1)
    if iWP1(x) < iWP2(x)
        WP1 = table1.WP1(x);
        WP2 = table1.WP2(x);
        DURpart = table1.duration / (WP2 - WP1);
        n = iWP2(x)-iWP1(x);
        for y = 0:n
            if y == 0
                newWP1 = [table1.WP1(x+y);table1.WP1(x);table1.WP1(~(x+y))];
                newWP2 = [table1.WP2(x+y);table2.WP2(iWP1(x));table1.WP2(~(x+y))];
            else 
                newWP1 = [table1.WP1(x+y);table1.WP2(x+y-1);table1.WP1(~(x+y))];
                if WP2 <= table2.WP2(iWP2(x)-iWP1(x)+1)
                    newWP2 = [table1.WP2(x+y);WP2;table1.WP2(~(x+y))];
                else 
                    newWP2 = [table1.WP2(x+y);table.WP2(iWP2(x)-iWP1(x)+1);table1.WP2(~(x+y))];
                end
        table1{:,[1,2,3]} = [0,newWP1, newWP2];
        for z = x:x+n
            table1.duration(z) = DURpart * (table1.WP2(z) - table1.WP1(z));
        end
            end
        end
    end
end

My thought process:

We go through the indices and everytime iWP2 is greater than iWP1 we want to create y = 0:(iWP2(x)-iWP1(x)) new rows. For these rows we need to define the new WP1 and WP2 and duration.

The first step is the location. The location is defined by the amount of new rows we need and the row we encountered the problem. So the location is x+y.

The second step would be the value. That's where I am already struggling.

The new WP1 of the added row would be the new WP2 of the new-old row.

But the WP1 of the new-old row would stay the same. How can I define that in one go, or do I need to add if y = 0. Even if I added if y = 0, that would change the values I'd need to calculate the duration. Therefor I'm assuming that needs to be done as a first step. Define WP1 and WP2 before we change that row and calculate the duration parts.

Next we set if y = 0, we have a special case where WP1 is defined as the WP1 we already had, and WP2 as the border of table2 range we are hitting.

If y > 0, the WP1 of the inserted row is defined as the WP2 of the previous row.

The new WP2 needs a new if clause, to check if it's smaller or equal the next border we could possibly hit. If it is smaller or equal the WP2 of this row is the WP2 of the original row, we replaced.

Finally if it is bigger, the WP2 is the border we hit at table2.

We then add the new row, where the colmn of duration has a value of 0, which will be calculated in the last step.

Is this somehow comprehensive?

I've been at it for days and am still struggling really hard - this is a key step I need to solve to proceed with my work!

Thanks for reading and your help!

TADA 2021-8-19

在 MATLAB Online 中打开

Addmitedly, I didn't read nor run your code, but your thaugt process sounds about right.

There is a simpler way though.. you don't need to test each record, because the indices already define the start and end positions in table2 of each table1 record that needs to be split up.

lets take row 4 which you mentioned as an example:

it starts at 115 and ends at 219.

the corresponding "windows" in table2 are rows 1 (0-145), 2 (145-169) and 3 (169-1693).

You already know from the indices that you calculated that it starts at row 1 and ends at row 3:

iWP1(4) % this should equal 1
iWP2(4) % this should equal 3

These are the indices of the corresponding rows in table2, and they should define the start and end positions of each new row if I understand the idea correctly.

% obviously in your code you would not hard-code 4, but use a loop index
% or something
newValues = table2(iWP1(4):iWP2(4), {'WP1', 'WP2', 'WC'});

next you need to change the WP1 of the first row and WP2 of the last row according to the original values you had from table1 (115 and 219 in stead of 0 and 1693).

now, all that is left is to recalculate the duration accordingly and replace the original row with these three rows.

You may want to add these three rows at the end of the table, and remove all the divided rows together at the end, once table editing is done, as changing the original indices of the rows you need to split will make your life much harder, and is bound to result in nightmarish bugs.

if the order of rows in table 1 is important you can always sort the table after you are done, or instead build up a separate output table as you go along.

Lukas Netzer 2021-8-20

编辑：Lukas Netzer 2021-8-20

在 MATLAB Online 中打开

edit:

I got it:

wp1Mask = table1.WP1 >= table2.WP1';
wp2Mask = table1.WP2 <= table2.WP2';
[~, iWP1] = max(cumsum(wp1Mask, 2), [], 2);
[~, iWP2] = max(wp2Mask, [], 2);
xIndex = iWP2 > iWP1;
d=find(xIndex);
table_new = table1;
for x = 1:size(table1)
    if iWP1(x) == iWP2(x)
        continue
    elseif iWP1(x) < iWP2(x)
        newValues = table2(iWP1(x):iWP2(x), {'WP1', 'WP2'});
        newValues.duration(1:size(newValues)) = 0;
        newValues = movevars(newValues, 'duration', 'Before', 'WP1');
        newValues.WP1(1) = table1.WP1(x);
        newValues.WP2(end) = table1.WP2(x);
        DurPart = table1.duration(x) / (table1.WP2(x) - table1.WP1(x));
        newValues.duration = DurPart .* (newValues.WP2 - newValues.WP1);
        table_new = [table_new;newValues];
    end
end 
table1 = table_new;
table1(d,:) = [];
table1 = sortrows(table1,2);

THANK YOU VERY MUCH FOR YOUR INPUT - may all the best come to you! Have a nice weekend!

请先登录，再进行评论。

Matching ranges of values and inserting new rows in columns

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

7 个评论
显示 5更早的评论隐藏 5更早的评论

更多回答（0 个）

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

Matching ranges of values and inserting new rows in columns

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

7 个评论 显示 5更早的评论隐藏 5更早的评论

更多回答（0 个）

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

7 个评论
显示 5更早的评论隐藏 5更早的评论