Summing over observations in unbalanced panel data
Hello, I have an unbalanced panel data set. For each id, I'd like to sum all values of x up to the latest time I observe that id, and record the sum in a new variable. How can I do this? Ideally, I'd like to avoid looping, as I have a large dataset and am trying to speed up the process.
Thank you in advance!
Selcen
3 Comments
dpb
2024-8-17
As @the cyclist notes, without a sample dataset we can't really give a direct response, but look at rowfun and/or splitapply.
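A minimal sketch of the splitapply route mentioned above, assuming the data can be reduced to two column vectors. The names id and x and the sample values are assumptions for illustration, not the poster's actual data.

```matlab
% Hypothetical sample data; the variable names id and x are assumptions.
id = [2; 1; 3; 2; 1; 3; 1];
x  = [5; 20; 30; 15; 10; 25; 35];

% findgroups maps each distinct id to a group number 1..K (sorted by id)
% and returns the distinct ids in the same order.
[G, ids] = findgroups(id);

% splitapply applies sum to the x values of each group, with no explicit loop.
totals = splitapply(@sum, x, G);

result = table(ids, totals, 'VariableNames', {'id', 'sumX'});
disp(result);
```

Because every observation of an id is included, the result is the sum up to the latest time that id is observed, regardless of how many periods each id has.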
Accepted Answer
Shishir Reddy
2024-8-20
Edited: Shishir Reddy
2024-8-20
Hi Selcen
As I understand it, for each id in an unbalanced panel data set you would like to sum all values of x up to the latest time that id is observed, and record the sum in a new variable.
Assuming the data is in a MATLAB table with at least three columns, ‘id’, ‘time’, and ‘x’, the following sample MATLAB code achieves this.
% Sample unsorted data
data = table([2; 1; 3; 2; 1; 3; 1], [1; 3; 2; 2; 1; 1; 2], [5; 20; 30; 15; 10; 25; 35], 'VariableNames', {'id', 'time', 'x'});
data = sortrows(data, {'id', 'time'}); % Sort the table by 'id' and 'time'
[~, idx] = unique(data.id, 'last'); % Row index of the latest time for each 'id'
runningTotal = cumsum(data.x); % Cumulative sum of 'x' over the whole sorted table
offset = [0; runningTotal(idx(1:end-1))]; % Total accumulated before each 'id' block starts
G = findgroups(data.id); % Group number (1, 2, ...) of each row
data.cumSumX = runningTotal - offset(G); % Cumulative sum of 'x' within each 'id'
latestCumulativeSum = data.cumSumX(idx); % Cumulative sum at the latest time for each 'id'
result = table(data.id(idx), latestCumulativeSum, 'VariableNames', {'id', 'LatestCumSumX'});
% Display the result
disp(result);
For more information on the ‘cumsum’ function, see the documentation: https://www.mathworks.com/help/matlab/ref/cumsum.html
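If the goal is to store each id's total on every row of the original table, one possible alternative is groupsummary combined with findgroups. This is only a sketch: the sample values and the new column name sumX are assumptions, and it relies on groupsummary, which is available in recent MATLAB releases.

```matlab
% Hypothetical sample data with the assumed columns id, time, and x.
data = table([2; 1; 3; 2; 1; 3; 1], [1; 3; 2; 2; 1; 1; 2], ...
    [5; 20; 30; 15; 10; 25; 35], 'VariableNames', {'id', 'time', 'x'});

% One row per id; the summed column is named sum_x by groupsummary.
S = groupsummary(data, 'id', 'sum', 'x');

% findgroups numbers the ids in the same sorted order as the rows of S,
% so indexing with G copies each id's total back onto its own rows.
G = findgroups(data.id);
data.sumX = S.sum_x(G);

disp(data);
```

This version does not require the table to be sorted first, since the grouping is by id only.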
I hope this helps.