Extracting information from a tall timetable using a loop

I'm trying to extract certain time ranges from a tall timetable using a loop, and I'm wondering how to do that most efficiently. In particular, gathering the data takes a lot of time, and I want to avoid doing it in every iteration of the loop.
My current idea for the code is shown below. It doesn't work when it comes to the calculations at the end. (Gathering inside the loop works but takes forever.)
location = 'C:\Folder'
ds = datastore(location)
TT = tall(ds)
x = {};
tic
for i =
    Strt = minutes(RTImport.Start(i)) % look up the start point for the extraction in another table
    endT = Strt + minutes(8) % calculate the end time for the extraction
    S = timerange(Strt,endT,'closed') % define the time range
    TT8 = TT(S,:) % pull the rows in that range from the tall TT
    Av = mean(TT8.variable,'omitnan') % do some calculations
    x{i} = Av % store the (still deferred) result in x{i}
end
toc
gather(x) % trying to perform all calculations on the tall table at once, but this doesn't work
location = 'folder'
write(location,x) % write is not supported for x
I'd like to do this as efficiently as possible. Also, if someone could point out the syntax for calculating the mean of several columns in a timetable (the mean of each individual column), I would be most obliged.

Accepted Answer

Sindar, 2020-11-1
Looks like you could do everything with groupsummary, assuming you can figure out how to define the bins
G = groupsummary(TT,'TotalItemsSold',groupbins,'mean',["Var1";"Var2";"Var3"]);
I don't have much experience with datetimes, but spitballing some ideas if ranges don't overlap:
  • create an N-by-2 matrix of start and end times
  • flatten it into a list of bin edges
  • throw out the bins that run from an end time to the next start time (worst case, you might need to do this after computing the means)
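The three steps above could look roughly like the sketch below. It is not from the original answer: it uses a small in-memory timetable with made-up names (starts, stops, edges, Var1), and it swaps in discretize with findgroups/splitapply instead of groupsummary to keep the binning explicit. Note that discretize bins are half-open (only the last bin includes its right edge), unlike the 'closed' timerange in the question, and whether every call behaves the same on a tall timetable would need to be checked.

```matlab
% Illustrative in-memory data; all names here are assumptions for this sketch.
starts = minutes([0; 20; 40]);               % start of each window
stops  = starts + minutes(8);                % each window is 8 minutes long
edges  = reshape([starts stops].', [], 1);   % interleaved edges: s1 e1 s2 e2 s3 e3

t    = minutes(0:50).';
Var1 = (0:50).';                             % simple values so the means are easy to check
TT   = timetable(Var1, 'RowTimes', t);

% Bin index for each row; rows outside all edges get NaN.
bin  = discretize(TT.Properties.RowTimes, edges);

% Odd bins run start->end (wanted); even bins run end->next start (gaps).
keep = mod(bin, 2) == 1;

% Mean per kept window, skipping NaNs in the data.
windowMeans = splitapply(@(v) mean(v, 'omitnan'), ...
    TT.Var1(keep), findgroups(bin(keep)));
```

The point of the design is that all windows are assigned in a single pass over the data, so nothing has to be re-read once per window as in the loop.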
  3 comments
Sindar, 2020-11-1 (edited by Sindar, 2020-11-1)
If you still end up needing to defer an array of results, this ended up working for me:
% run the deferred operations to compute Phases
% the trick: each cell of Phases contains the recipe for a deferred
% operation. gather runs each recipe, so MATLAB knows the answers,
% but they aren't immediately stored in the variable
% this will take a while, but seems to be the fastest way
gather(x{:})
% update the x variable by looking at the answers stored above, then
% reshape to the correct matrix
x=reshape(gather([x{:}]),size(x));
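A possibly simpler variant, offered only as a sketch: gather accepts several tall inputs and returns one output per input, so the deferred results in a cell array can be evaluated together and written straight back. The data and the names tv and x below are made up for illustration and are not the poster's Phases variable.

```matlab
% Illustrative tall array built from in-memory data (an assumption for this sketch).
tv = tall((1:10).');

% Build the deferred results without gathering inside the loop.
x = cell(1, 3);
for i = 1:3
    x{i} = mean(tv(tv > i));    % still an unevaluated tall scalar here
end

% Evaluate all deferred results in one gather call and store them back.
[x{:}] = gather(x{:});

% x now holds ordinary doubles, so it can be turned into a numeric row.
xNumeric = [x{:}];
```

Gathering everything in one call lets MATLAB combine the passes through the data where it can, which is the reason for deferring the results in the first place.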
Lutetium, 2020-11-1
I managed to compute the averages of all the columns (skipping the NaNs) with this code:
func = @(x) mean(x,'omitnan'); %ignoring NaN
varfun(func,TT8,'OutputFormat','table')
For now I'm running the code that gathers the information inside the loop, since I need some results on Monday, and so far it looks like I'll make it :)
I'll definitely try your approach for future data extraction. It seems to be exactly what I was looking for. I appreciate your help! Thanks
