Use splitapply with division

Hi,
I have data for a total of 76 stocks over one year. I would like to normalize each stock's data by dividing its whole time series by its first entry.
For a single stock, it works like this:
D1990 = D(D.year==1990 & D.gvkey==15497,:);
D1990.pricenorm = D1990{:,"priceadj"}./D1990{1,"priceadj"};
The data looks like this.
Here gvkey is the unique stock ID and priceadj is the price of the stock each day.
The other variables are just date variables.
My idea was to do it with splitapply, but unfortunately I can't get it to work.
[group1, ID] = findgroups(D1990.gvkey);
x = splitapply(@(x,y) x./y, D1990{:,"priceadj"}, D1990{1,"priceadj"} group1);
I think using the ID as the group doesn't work, and I'm also not sure whether I'm using the function in splitapply correctly.
I also attached the actual file.
Does someone know how to fix it?
Thank you in advance.

Accepted Answer

Mario Malic 2023-9-27
Hey, is this what you are looking for?
load D1990.mat
[group1, ID] = findgroups(D1990new.gvkey);
y = splitapply(@(x) {x./x(1)}, D1990new.priceadj, group1)
D1990new.priceadjNorm = cell2mat(y)
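One thing to watch: cell2mat(y) stacks the results group by group, so it only lines up with the table rows when the table is already sorted by gvkey. If the rows might be interleaved, a variant like the sketch below keeps the original row order. The table T and its values are made up for illustration and are not the attached data.

```matlab
% Toy table standing in for D1990new (hypothetical values, rows deliberately unsorted)
T = table([2;1;2;1;2], [10;4;20;8;5], 'VariableNames', {'gvkey','priceadj'});

g = findgroups(T.gvkey);                              % group number for every row
firstRow = splitapply(@(r) r(1), (1:height(T))', g);  % row index of the first entry of each group
T.pricenorm = T.priceadj ./ T.priceadj(firstRow(g));  % divide each row by its own group's first price
```

Because firstRow(g) looks up the first price for every row individually, no cell array or cell2mat is needed and the result does not depend on how the table is sorted.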

More Answers (1)

dpb 2023-9-27
Edited: dpb 2023-9-28
@Mario Malic fixed the problem with splitapply: you only wanted to divide each group by its first element. Since that element is a scalar, the "dot" divide operator isn't required here, although using it does no harm and is probably best practice.
An alternative to illustrate some other newer features of tables...
load D1990
tD=D1990new; % get a short name for convenience
clear D1990new
tD=addvars(tD,cell2mat(rowfun(@(p)p/p(1),tD,'GroupingVariables',{'gvkey'},'InputVariables',{'priceadj'}, ...
'OutputVariableName',{'pricenorm'},'OutputFormat','cell')), ...
'After','priceadj','NewVariableNames',{'pricenorm'});
format bank
head(tD)
     gvkey         date        month     year      monthyear    monthyear_1    priceadj    pricenorm
    ________    ___________    _____    _______    _________    ___________    ________    _________
    15497.00    30-Jan-1990    1.00     1990.00    Jan-1990      Jan-1990      1908.18       1.00
    15497.00    13-Feb-1990    2.00     1990.00    Feb-1990      Feb-1990      1908.18       1.00
    15497.00    23-Feb-1990    2.00     1990.00    Feb-1990      Feb-1990      1799.55       0.94
    15497.00    26-Feb-1990    2.00     1990.00    Feb-1990      Feb-1990      1804.27       0.95
    15497.00    28-Feb-1990    2.00     1990.00    Feb-1990      Feb-1990      1799.55       0.94
    15497.00    01-Mar-1990    3.00     1990.00    Mar-1990      Mar-1990      1794.82       0.94
    15497.00    06-Mar-1990    3.00     1990.00    Mar-1990      Mar-1990      1790.10       0.94
    15497.00    07-Mar-1990    3.00     1990.00    Mar-1990      Mar-1990      1794.82       0.94
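Another option, not used in the answers above, is grouptransform (available since R2018b), which applies a function per group and returns the result in the original row order. A minimal sketch on a made-up toy table (T and its values are hypothetical, not the attached file):

```matlab
% Toy table standing in for D1990new (hypothetical values)
T = table([2;1;2;1;2], [10;4;20;8;5], 'VariableNames', {'gvkey','priceadj'});

% By default grouptransform replaces priceadj in its output table,
% so take the result from G and keep the original prices in T.
G = grouptransform(T, 'gvkey', @(p) p./p(1), 'priceadj');
T.pricenorm = G.priceadj;   % normalized price, same row order as T
```

This avoids both the cell output and the cell2mat step, at the cost of building a temporary table G.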
1 Comment
Luca 2023-9-28
Edited: Luca 2023-9-28
Thank you very much, this works too. I wasn't aware of the function addvars; it's cool to learn something new.

Release: R2023a
