Replace String in Table if Substring is Found

5 次查看(过去 30 天)
Hi all,
I am attempting to convert a table column filled with text into categorical data. As an example, for I have a category statuette and several different cells read: "Broken statuette", "Terracotta statuette" or "Statuette". I want all of these to read Statuette. How might I approach this problem?
Thanks

回答(2 个)

dpb
dpb 2022-10-22
id=["Broken statuette", "Terracotta statuette","Statuette"]; % the variable
valset=[id,"Other Stuff"]; % the values allowed in the category
catnames=[repmat("statuette",1,sum(contains(id,'statuette','IgnoreCase',1))) "stuff"]; % what to display for each value 1:1
id=categorical(id,valset,catnames) % apply to the input array and...
id = 1×3 categorical array
statuette statuette statuette
This is barest/crudest way to illustrate -- the input array is what you have; the valueset is all possible values to allow in the categorical variable (can be more or less than unique(id); missing are <undefined> if in the input array). The catnames are the names to be displayed for each element of valuset on a 1:1 match.
categories(id)
ans = 2×1 cell array
{'statuette'} {'stuff' }
shows the categories for the variable consist of the two values in valueset even though the one wasn't in the input array to convert.

Image Analyst
Image Analyst 2022-10-22
You forgot to attach your table. Here is one way though (untested). Assume you have a table with a column called "Statuette"
for row = 1 : height(t) % t is your table variable
if contains(t.Statuette{row}, 'Statuette', 'IgnoreCase', true)
t.Statuette{row} = 'Statuette';
end
end
If you have any more questions, then attach your data and code to read it in with the paperclip icon after you read this:
  2 个评论
Image Analyst
Image Analyst 2022-10-22
I don't think you can "convert" an existing column from one data class to a different kind. You'd have to add a new column with addvars and then fill it up with the categorical values. More complicated so I just replaced the strings, which was simpler.
dpb
dpb 2022-10-22
The above in an existing table could look like
isS=contains(t.Statuette,'Statuette', 'IgnoreCase', true);
t.Statuette(is)='Statuette';
t.Statuette=categorical(t.Statuette);
Doing the above loses the additional data associated with the condition of the items; that may or may not be something to preserve.
"I don't think you can "convert" an existing column from one data class to a different kind."
Au contraire, good buddy! :)
t=convertvars(t,'Statuette','categorical');
will do the deed, too. convertvars doesn't have the ability to pass the additional parameters that are available with categorical illustrated above, however; it would result in the same applied after the string conversion as above, but if used before, then unique(original variable values) would be the categories.
A very neat thing w/ addvars (and tables in general) is the vartype class that would let you do all variables of class('string') without needing to know the column names/locations.

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Data Type Conversion 的更多信息

产品

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by