multiple condition sumif without loop
9 次查看(过去 30 天)
显示 更早的评论
I have the following table and want to sum over unique dates and Accounts to get the daily changes per account
TransactionID = {'102001';'102002';'102003';'102004';'102005';'102006';'102007';'102008';'102009';'102010'};
Date = datetime({'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-09';'2021-02-09';'2021-02-09';'2021-02-17'},"InputFormat","uuuu-MM-dd");
AmountDebited = [3000; 0; 0; 3000; 4443; 2.02877864; 430; 50; 200; 0.297598984636403];
AccountDebited = ["TD-CAD";"LC-CAD";"LC-CAD";"SI-CAD";"SP-CAD";"LC-USD";"SP-CAD";"SP-CAD";"SP-CAD";"LC-USD"];
AmountCredited = [3000; 0; 0; 3000; 2.03048665; 4450; 0.18713085; 0.02177471; 0.0869854; 0.295498984636403];
AccountCredited = ["SP-CAD"; "MM-LPT"; "KN-DIR"; "SP-CAD"; "LC-USD"; "SI-CAD"; "LC-USD"; "LC-USD"; "LC-USD"; "MM-USD"];
X = table(TransactionID, Date, AmountDebited, AccountDebited, AmountCredited, AccountCredited);
Right Now I can do it using the following for loop to compute the result but My actual table has thousands of records and the calculation takes forever so have been trying to get it to work using indexes but had no luck so far.
UniqueDebit = unique(X.AccountDebited);
UniqueCredit = unique(X.AccountCredited);
UniqueConc = unique([UniqueDebit;UniqueCredit]);
UniqueDates = unique(X.Date);
for i = 1:length(UniqueDates)
for j = 1:height(UniqueConc)
RowsCredited = find(X.Date == UniqueDates(i) & X.AccountCredited == UniqueConc(j));
RowsDebited = find(X.Date == UniqueDates(i) & X.AccountDebited == UniqueConc(j));
DailyCredited = sum(X.AmountCredited(RowsCredited));
DailyDebited = sum(X.AmountDebited(RowsDebited));
DailyChanges = DailyCredited - DailyDebited;
XMatrix(i,j) = DailyChanges;
end
end
XTable=array2table(XMatrix);
XTable.Properties.VariableNames = UniqueConc;
XTable.Properties.RowNames = string(UniqueDates);
0 个评论
采纳的回答
Jan
2022-5-26
编辑:Jan
2022-5-26
Start with avoiding repeated work:
XMatrix = zeros(length(UniqueDates), height(UniqueConc)); % Pre-allocate!!!
xCredit = X.AccountCredited; % Abbreviation
xDebit = X.AccountDebited;
for i = 1:length(UniqueDates)
tmp = (X.Date == UniqueDates(i)); % Move out of the inner loop
for j = 1:height(UniqueConc)
RowsCredited = (tmp & xCredited == UniqueConc(j));
RowsDebited = (tmp & xDebited == UniqueConc(j));
DailyCredited = sum(X.AmountCredited(RowsCredited)); % [EDITED]
DailyDebited = sum(X.AmountDebited(RowsDebited)); % [EDITED]
XMatrix(i,j) = DailyCredited - DailyDebited;
end
end
Avoid the find(), because logical indexing is faster.
Is XMatrix preallocated?
3 个评论
Jan
2022-5-26
编辑:Jan
2022-5-26
It is hard to optimize code without having realistic input data. I cannot estimate, if an accumarray approach is faster here. Are you sure, that the loop is the bottleneck?
Another idea is to use ismember once, such that Matlab can search in numerical indices instead of string arrays:
xAccDebit = X.AccountDebited;
xAccCredit = X.AccountCredited;
xAmDebit = X.AmountDebited;
xAmCredit = X.AmountCredited;
uDebit = unique(xAccDebit);
uCredit = unique(xAccCredit);
uConc = unique([uDebit; uCredit]);
[uDates, ~, uDateInd] = unique(X.Date);
[~, creditInd] = ismember(xAccCredit, uConc);
[~, debitInd] = ismember(xAccDebit, uConc);
X = zeros(numel(uDates), numel(uConc)); % Pre-allocate
for i = 1:numel(uDates)
tmp = (uDateInd == i); % same as: (X.Date == uDates(i));
mCreditInd = creditInd .* tmp; % mask FALSE in tmp as 0 in vector
mDebitInd = debitInd .* tmp; %
for j = 1:numel(uConc)
RowsCredited = (mCreditInd == j);
RowsDebited = (mDebitInd == j);
DailyCredited = sum(xAmCredit(RowsCredited));
DailyDebited = sum(xAmDebit(RowsDebited));
X(i,j) = DailyCredited - DailyDebited;
end
end
Is numel(uDates) much larger than numel(uConc)? Maybe swapping the loops is faster.
I've edited the code in my first answer: I've abbreviated "X.AccountDebited" and "X.AmountDebited" both by "xDebited".
Are you working with less than 65536 elements? Then it might save some time, if you convert mCreditInd and mDebitInd to UINT16 and the loop e.g. to:
for j = uint16(1):uint16(numel(uConc))
更多回答(1 个)
Seth Furman
2022-6-2
It's worth noting that a lot of these computations can be expressed more succinctly using groupsummary, outerjoin, and fillmissing.
TransactionID = {'102001';'102002';'102003';'102004';'102005';'102006';'102007';'102008';'102009';'102010'};
Date = datetime({'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-06';'2021-02-09';'2021-02-09';'2021-02-09';'2021-02-17'},"InputFormat","uuuu-MM-dd");
AmountDebited = [3000; 0; 0; 3000; 4443; 2.02877864; 430; 50; 200; 0.297598984636403];
AccountDebited = ["TD-CAD";"LC-CAD";"LC-CAD";"SI-CAD";"SP-CAD";"LC-USD";"SP-CAD";"SP-CAD";"SP-CAD";"LC-USD"];
AmountCredited = [3000; 0; 0; 3000; 2.03048665; 4450; 0.18713085; 0.02177471; 0.0869854; 0.295498984636403];
AccountCredited = ["SP-CAD"; "MM-LPT"; "KN-DIR"; "SP-CAD"; "LC-USD"; "SI-CAD"; "LC-USD"; "LC-USD"; "LC-USD"; "MM-USD"];
X = table(TransactionID, Date, AmountDebited, AccountDebited, AmountCredited, AccountCredited);
Find total amounts credited/debited for each date-account pair
credited = groupsummary(X,["Date","AccountCredited"],"sum",["AmountCredited"])
debited = groupsummary(X,["Date","AccountDebited"],"sum",["AmountDebited"])
Match total amounts credited/debited by date-account pair
credited.GroupCount = [];
debited.GroupCount = [];
sums = outerjoin(credited,debited,LeftKeys=["Date","AccountCredited"],RightKeys=["Date","AccountDebited"],MergeKeys=true)
Fill missing data
sums = fillmissing(sums,"constant",0,DataVariables=[3 4])
Compute net change by date-account pair
sums.NetChange = sums.sum_AmountCredited - sums.sum_AmountDebited
0 个评论
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Loops and Conditional Statements 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!