Faster/Simpler way to update my data in a table

43 次查看(过去 30 天)
I have the following code:
H = height(T);
count = 0;
z=num2cell(0);
for i=1:1000
if strcmp(T{i,2},'sage')
count = count+1;
z = num2cell(20*count);
T{i,1} = z;
else
T{i,1} = z;
end
end
And it does precisely what I want it to. My issue is this loop is being applied to a massive table. The table has just over 658k rows in it, and I will need this code to apply to any table I upload, so the row count could potentially be even more for other datasets. After the complete if statement, I added a second if statement of "if i = 1000; break; end" and then I did the "run and time". That portion took about 36 seconds, and I did the math and found that would mean that it takes about 6.7 hours for this one for loop to execute throughout the entire table.
Here is a screenshot of part of the table as an example:
I inlclude this only to show that the string 'sage' shows up at random intervals, so I don't know what commands I could use to skip using a for loop altogether (as that would be the faster option).
  4 个评论
Ruger28
Ruger28 2020-2-18
is the delta between 'grey' always the same? so once you find grey, the index from the first to the next will always be 20?
stone_ward
stone_ward 2020-2-18
No. And I just realized, it was supposed to be 'sage', not 'grey'... I apologize, I changed the character strings to colors as this is for a work project, and got mixed up. I've updated the code above to reflect that. But as you can see in the screenshot I included, sage first occurs on row 19, the second occurance is on row 30, and the third occurance is row 33. It's random.

请先登录,再进行评论。

采纳的回答

Jon
Jon 2020-2-18
编辑:Jon 2020-2-18
Here's an approach that I think does what you want.
You would have to benchmark the run time, but the approach is quite vectorized and I would assume it is quite a bit more efficient than what you were doing
% create some data just to demonstrate
% you would read from your actual table
time = 1:100;
colorIdx = randi([1,4],100,1)
colorChoices = {'red','sage','fuschia','gray'}
color = colorChoices(colorIdx)'
% make the table
T = table(time(:),color)
% make the new column incrementing by 20 for each occurrence of sage
% make a logical vector of indices (rows) where color is gray
idlColorMatch = strcmp(T.color,'sage')
% make the new time column
T.newTime = 20*(cumsum(idlColorMatch))
  2 个评论
Jon
Jon 2020-2-21
Glad to hear that was what you needed. Thanks for letting me know. The cumsum counting of the logical indices comes in handy anytime you need to count occurences, and is so amazingly simple.

请先登录,再进行评论。

更多回答(0 个)

类别

Help CenterFile Exchange 中查找有关 Characters and Strings 的更多信息

产品


版本

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by