Matching ranges of values and inserting new rows in columns
3 次查看(过去 30 天)
显示 更早的评论
Following a previous question, I now need to dive a little deeper:
I have two tables (workspace is attached):
table1 contains the columns "duration", "WP_1", "WP_2":
table 2 contains the columns "WP1", "WP2", "WC":
I need to split WP_1 and WP_2 in table 1 according to WP1 and WP2 in table2 - what does that mean?
Duration and WC need to be matched, but the ranges of WP_1 & WP_2 and WP1 & WP2 sometimes don't fit.
Let me give you an example:
- row3 in table1 for WP_1 and WP_2 has a range of 5 to 115 with a duration of 552.8
--> this perfectly fits into row1 of table2, where we have a range of 0 to 145 with a WC of 0
--> therefor 100% of the duration in table1 can be connected to WC 0
Problem:
- row4 in table1 for WP_1 and WP_2 has a range of 115 to 219 with a duration of 584.7
--> does not fit in just one category:
row1 in table2: range for WP1 and WP2 is 0 to 145 with WC 0
row2 in table2: range for WP1 and WP2 is 145 to 169 with WC 1
row3 in table2: range for WP1 and WP2 is 169 to 1693 with WC 1
Solution:
- split the WP_1 and WP_2 according to WP1 and WP2 and calculate their new duration
- duration will be calculated with the according parts
- WP_1 and WP_2 in table1's row4 will be split and new rows will be inserted:
--> original row4: WP_1: 115 WP_2: 219 duration: 584.7 --> 104 parts (219-115); 1 part =~ 5.62
--> new row4: WP_1: 115 WP_2:145 duration: 168.66 (30 parts (145-115) x 5.62)
--> new row5: WP_1:145 WP_2: 169 duration: 134.93 (24 parts (169-145) x 5.62)
--> new row6: WP_1:169 WP_2: 219 duration: 281.1 (50 parts (219-169) x 5.62)
the new table1 should look like that:
The main problem is the matching and splitting part - calculating the new durations should be quite easy.
Unfortunately as of writing this I have no clue how to match and split - any hints are very much appreciated!
Thanks for your help!
0 个评论
采纳的回答
TADA
2021-8-17
Once you know that table2 is sorted in ascending order and that there are no overlaps and no shenanigans, you can map the positions of table1 entries in table2 windows.
% this will generate a rather large matrix which maps the entries in table1
% to windows in table2 which start before they start.
% the last true flag in each row will be the starting WP1 for this entry
wp1Mask = table1.WP_1 >= table2.WP1';
the same trick can be done with WP2 only using less than because we want to find the window in table2 which ends after the table1 entry ends
After you finish mapping everything up, you can start correlating.
I normally prefer using logical indices over using find, because in most cases it is not necessary.
The problem is find doesn't work on a specific dimention in a matrix like other functions do, so you can either do it with a loop, or you can use this nifty trick which was proposed by Matt J in another post:
% this will find the first index of a table2 window that starts before each
% table1 entry - not what we want actually, but could be useful for WP2
[~, iWP1] = max(wp1Mask, [], 2);
% to achieve what we seek we can use cumsum combined with the same trick
% this will find the index of the last table2 window which starts before each table1
% entry - what we seek.
[~, iWP1] = max(cumsum(wp1Mask, 2), [], 2);
Once you know these indices, we can find which table1 entries fit only one window and which fit in several windows, and then split those
hope this is useful
7 个评论
TADA
2021-8-19
Addmitedly, I didn't read nor run your code, but your thaugt process sounds about right.
There is a simpler way though.. you don't need to test each record, because the indices already define the start and end positions in table2 of each table1 record that needs to be split up.
lets take row 4 which you mentioned as an example:
it starts at 115 and ends at 219.
the corresponding "windows" in table2 are rows 1 (0-145), 2 (145-169) and 3 (169-1693).
You already know from the indices that you calculated that it starts at row 1 and ends at row 3:
iWP1(4) % this should equal 1
iWP2(4) % this should equal 3
These are the indices of the corresponding rows in table2, and they should define the start and end positions of each new row if I understand the idea correctly.
% obviously in your code you would not hard-code 4, but use a loop index
% or something
newValues = table2(iWP1(4):iWP2(4), {'WP1', 'WP2', 'WC'});
next you need to change the WP1 of the first row and WP2 of the last row according to the original values you had from table1 (115 and 219 in stead of 0 and 1693).
now, all that is left is to recalculate the duration accordingly and replace the original row with these three rows.
You may want to add these three rows at the end of the table, and remove all the divided rows together at the end, once table editing is done, as changing the original indices of the rows you need to split will make your life much harder, and is bound to result in nightmarish bugs.
if the order of rows in table 1 is important you can always sort the table after you are done, or instead build up a separate output table as you go along.
更多回答(0 个)
另请参阅
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!