How to store data when a string condition of another column is met?

1 次查看(过去 30 天)
I have imported an excel file, which has three columns: EVENT_ID, BEGIN_DATE, STATE_ABBR.
EVENT_ID BEGIN_DATE STATE_ABBR
1 01-Apr-2000 CT
2 02-Jun-2000 CT
3 13-Aug-2000 CT
4 29-Oct-2000 CT
5 31-Apr-2002 CT
6 06-Jun-2009 CT
7 08-Jan-2011 DE
8 09-Dec-2016 DE
9 11-Feb-2019 MA
10 20-Aug-2020 MD
11 19-Nov-2020 MD
12 22-Apr-2021 MD
. . .
. . .
. . .
I want to store the number of times (COUNT) a year (YEAR) has repeated in BEGIN_DATE corresponding to STATE_ABBR. In this case, the output should look like:
YEAR COUNT STATE_ABBR
2000 4 CT
2002 1 CT
2009 1 CT
2011 1 DE
2016 1 DE
2019 1 MA
2020 2 MD
2021 1 MD
. . .
. . .
. . .
Any idea on how to do this?

采纳的回答

Stephen23
Stephen23 2021-10-15
T = readtable('SED_FULL_NE.xlsx')
T = 1300×34 table
EVENT_ID CZ_NAME_STR BEGIN_LOCATION BEGIN_DATE EVENT_TYPE MAGNITUDE TOR_F_SCALE DEATHS_DIRECT INJURIES_DIRECT DAMAGE_PROPERTY_NUM DAMAGE_CROPS_NUM STATE_ABBR CZ_TIMEZONE MAGNITUDE_TYPE EPISODE_ID CZ_TYPE CZ_FIPS WFO INJURIES_INDIRECT DEATHS_INDIRECT SOURCE FLOOD_CAUSE TOR_LENGTH TOR_WIDTH BEGIN_RANGE BEGIN_AZIMUTH END_RANGE END_AZIMUTH END_LOCATION END_DATE END_TIME BEGIN_LAT BEGIN_LON END_LAT __________ ______________________________ ______________ ___________ ___________ __________ ___________ _____________ _______________ ___________________ ________________ __________ ___________ ______________ __________ _______ _______ _______ _________________ _______________ ________________________ ___________ __________ __________ ___________ _____________ __________ ___________ ____________ ___________ ________ __________ __________ __________ 5.2897e+06 {'NORTHERN FAIRFIELD (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 5 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'SOUTHERN NEW LONDON (ZONE)'} {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 12 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'SOUTHERN MIDDLESEX (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 11 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'SOUTHERN NEW HAVEN (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 10 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'SOUTHERN FAIRFIELD (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 9 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'NORTHERN NEW LONDON (ZONE)'} {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 8 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'NORTHERN MIDDLESEX (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 7 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2897e+06 {'NORTHERN NEW HAVEN (ZONE)' } {0×0 char} 01-Apr-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1327e+06 {'Z'} 6 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 30-Apr-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'SOUTHERN FAIRFIELD (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 9 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'SOUTHERN MIDDLESEX (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 11 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'NORTHERN NEW LONDON (ZONE)'} {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 8 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'SOUTHERN NEW HAVEN (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 10 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'NORTHERN MIDDLESEX (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 7 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'NORTHERN NEW HAVEN (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 6 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'SOUTHERN NEW LONDON (ZONE)'} {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 12 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char} 5.2968e+06 {'NORTHERN FAIRFIELD (ZONE)' } {0×0 char} 01-May-2002 {'Drought'} {0×0 char} {0×0 char} 0 0 0 0 {'CT'} {'EST'} {0×0 char} 1.1349e+06 {'Z'} 5 {'OKX'} 0 0 {'OTHER FEDERAL AGENCY'} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} {0×0 char} 31-May-2002 2359 {0×0 char} {0×0 char} {0×0 char}
[G,YEAR,STATE_ABBR] = findgroups(year(T.BEGIN_DATE),T.STATE_ABBR);
COUNT = accumarray(G,ones(numel(G),1));
out = table(YEAR,COUNT,STATE_ABBR) % you can SORTROWS this if required
out = 39×3 table
YEAR COUNT STATE_ABBR ____ _____ __________ 2000 4 {'DE'} 2000 5 {'MD'} 2000 20 {'NJ'} 2000 11 {'PA'} 2001 20 {'DE'} 2001 26 {'MD'} 2001 85 {'ME'} 2001 60 {'NH'} 2001 94 {'NJ'} 2001 29 {'NY'} 2001 41 {'PA'} 2002 24 {'CT'} 2002 44 {'DE'} 2002 55 {'MD'} 2002 61 {'NY'} 2005 4 {'DE'}
  4 个评论
Stephen23
Stephen23 2021-10-15
编辑:Stephen23 2021-10-15
COUNT = accumarray(G,ones(numel(G),1));
counts the ones corresponding to each group in G:
[G,YEAR,STATE_ABBR] = findgroups(T.BEGIN_DATE.Year,T.STATE_ABBR);
returns a group number for each unique pair of data-points in the two input arguments.

请先登录,再进行评论。

更多回答(1 个)

KSSV
KSSV 2021-10-15
编辑:KSSV 2021-10-15
T = readtable('https://in.mathworks.com/matlabcentral/answers/uploaded_files/768171/SED_FULL_NE.xlsx') ;
id = [1 4 12] ;
T = T(:,id) ;
thedates = T.(2) ;
[theyear,ia,ib]=unique(year(thedates)) ;
count=accumarray(ib,1) ;
state_abbr = T.(3)(ia) ;
T = table(theyear,count,state_abbr)
T = 12×3 table
theyear count state_abbr _______ _____ __________ 2000 40 {'DE'} 2001 355 {'DE'} 2002 184 {'CT'} 2005 9 {'DE'} 2007 76 {'DE'} 2008 96 {'DE'} 2010 59 {'MD'} 2012 96 {'CT'} 2015 39 {'NJ'} 2016 156 {'CT'} 2017 53 {'CT'} 2020 137 {'CT'}
  10 个评论
Rabeca Mohammed
Rabeca Mohammed 2021-10-15
Could you please explain the following codes?
[year,ia,ib]=unique(thedates.Year) ;
count=accumarray(ib,1);
state_abbr = T.(3)(ia) ;
What are ia, ib? Why is ib in the accumarray function? Why is 1 in it too? What is T.(3)(ia)?
Rabeca Mohammed
Rabeca Mohammed 2021-10-15
My bad, the ouput is wrong. In the Excel file, for the year 2000 (BEGIN_DATE = 2000), DE (STATE_ABBR = DE) has only 4 entries. So, count should be 4, not 40. COUNTs of other STATE_ABBRs are also inconsistent to the excel file.

请先登录,再进行评论。

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by