How to iterate through rows in excel spreadsheet and sum specific data from there?

2 次查看(过去 30 天)
I have a 3 column, 23829 row spreadsheet in excel. The top rows of the spreadsheet are below.
COUNTY Total Shift Total Work Time
1000 0 0
1000 205 300
1000 30 30
1000 60 345
1000 0 0
1000 0 0
1000 0 0
1000 0 0
1000 0 0
1000 0 0
1000 18 468
1000 0 0
1000 60 487
1000 0 0
1000 0 0
1000 0 495
1000 0 0
1000 0 0
1000 0 0
1000 0 0
1000 0 0
1000 0 0
1000 0 0
1000 120 570
1003 0 0
1003 0 450
1003 0 0
1003 0 0
1003 0 0
1003 195 495
1003 60 455
1003 0 0
1003 0 0
1003 15 255
1003 144 699
1003 0 0
1003 65 575
1003 15 540
1003 30 380
1003 83 603
1003 5 495
1003 30 300
1003 0 0
1003 0 380
1003 103 211
1003 0 0
1003 37 40
1003 0 0
1003 0 0
1003 0 0
1003 300 420
1003 0 0
1003 0 0
1003 0 0
1003 0 0
1003 0 0
1015 0 0
1015 120 675
1015 0 0
1015 40 280
1015 0 37
1015 0 375
1015 0 0
1015 0 0
1015 10 490
1015 0 0
1015 0 515
1015 140 680
1015 0 0
1015 0 0
1015 0 0
1015 0 0
1015 0 0
1015 0 0
1015 0 0
1015 240 660
1015 0 0
1015 0 0
1015 0 0
1015 0 0
1015 0 0
1073 0 0
1073 0 0
1073 0 480
1073 285 810
1073 0 0
1073 0 0
1073 0 0
1073 45 645
1073 141 606
1073 15 495
1073 0 0
1073 0 0
1073 5 5
1073 0 0
1073 0 490
1073 10 520
1073 0 0
1073 0 0
1073 85 175
1073 0 0
1073 0 0
1073 165 650
1073 0 0
1073 0 524
1073 0 0
I have many different counties (starting at FIPS code 1000). I want to add up all the numbers in "Total Shift" within each county, and add up all the numbers in "Total Work Time" for each county. Then, I need to divide the total work time by the total shift, for each county. How can I do this in MatLab?
Ex: Sum all "total shift" and "total work time" for COUNTY "1003" and then divide total shift/total work time. I want the results of the sums of each county in a table with columns: county, total shift, total work time, and fraction shift/work.
  1 个评论
dpb
dpb 2018-7-27
Use readtable to bring the data into a table and then findgroups and splitapply can do whatever processing you wish by county. There are some advantages if you convert the COUNTY codes into categorical, but if they're all integers, that is easy enough to work with as well.

请先登录,再进行评论。

采纳的回答

Albert Fan
Albert Fan 2018-7-27
You can do something like this:
data = readtable('test.xlsx');
total_shift_county1000 = data(data.COUNTY == 1000, 'TotalShift');
total_shift_county1000_sum = sum(table2array((total_shift_county1000)))
you can replace data.COUNTY == 1000 as any county number you wish
  3 个评论
Albert Fan
Albert Fan 2018-7-27
I see. you have two problems. The first one is that you do not want to call readtable() inside the for loop since it will attempt to read the file at each iteration, which is unnecessary. The second one is that since you put outTable = table(geoCode, shiftSum, workSum, fractionShift); inside the for loop, the outTable variable will be overrided at each iteration. If you want to keep all of your results, you should create the outTable outside the for loop by sonething like outTable = table(), and update it by: outTable = [outTable;table(geoCode, shiftSum, workSum, fractionShift)]. However this will create a new issue that all geocodes will be stored in the table, you can avoid that by
if ~sum(data.COUNTY == geoCode) == 0
% your calculations
end
Albert Fan
Albert Fan 2018-7-27
The final result may looks like this:
data=readtable('test.xlsx');
outTable = table()
for geoCode=1000:2000
if ~sum(data.COUNTY == geoCode) == 0
shiftCounty = data(data.COUNTY == geoCode, 'TotalShift');
shiftSum = sum(table2array((shiftCounty)));
workCounty = data(data.COUNTY == geoCode, 'TotalWorkTime');
workSum = sum(table2array((workCounty)));
fractionShift = shiftSum/workSum;
outTable = [outTable;table(geoCode, shiftSum, workSum, fractionShift)];
end
end

请先登录,再进行评论。

更多回答(1 个)

dpb
dpb 2018-7-28
编辑:dpb 2018-7-28
m=readtable('milli.dat');
[g,County]=findgroups(m.COUNTY);
sums=splitapply(@(x,y) sum([x y]),m.Work,m.Shift,g);
s=table(County,sums(:,1), sums(:,2), sums(:,1)./sums(:,2), ...
'VariableNames',{'County','ShiftSum','WorkSum','Ratio'})
s =
4×4 table
County ShiftSum WorkSum Ratio
______ ________ _______ ______
1000 2695 493 5.4665
1003 6298 1082 5.8207
1015 3712 550 6.7491
1073 5400 751 7.1904
>>
I did change the column titles to 'Shift' and 'Work' to be valid ML variable names; names with spaces aren't recognizable.

类别

Help CenterFile Exchange 中查找有关 Tables 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by