Extract and code gender string as number using if loop

10 次查看(过去 30 天)
I have data for 200 participants stored in a 1x1 structure with 3 fields (time data, gender and age). I'm trying to include lines of script that will extract the gender of each participant and code their gender as 0 (non-binary), 1 (female), or 2 (male). I'm also assigning an ID and extracting age, code so far is below:
for i = 1:size(files);
id = str2num(files(i).name(7:size(files(i).name,2)-4));
tempfile = load(files(i).name);
age = tempfile.data.age;
end
I don't have experience in using an if-loop on strings and am struggling to find the best command to use. I have tried the below, as this is what I would do if extracting and coding numbers, but obviously this doesn't work for these strings as they are not matched in size. I've also played around with STRCMP and STRNCMP but with no luck.
if tempfile.data.gender == 'non-binary'
gender = 0;
elseif tempfile.data.gender == 'female'
gender = 1;
elseif tempfile.data.gender == 'male'
gender = 2;
A little help would be much appreciated!

采纳的回答

Star Strider
Star Strider 2019-11-20
I do not udnerstand the reason strcmp or strncmp would not work with your structure data. We may have to see ‘tempfile.data.gender’ in order to understand the problem.
Meanwhile, consider strcmpi to do case-insensitive comparisons.
  5 个评论
Jen
Jen 2019-11-21
Ah yes, that works perfectly and much more concise than my method. Thank you!

请先登录,再进行评论。

更多回答(1 个)

Bob Thompson
Bob Thompson 2019-11-20
Just to clarify, you have 'gender' as two separate variables, once as a variable called 'gender' which will have a numeric assignment, and once as a field in a nested structure?
strcmp and == will ultimately have the same result, but you are going to run into the size issue on both of them. I work around this by having a second condition to be met for size.
if length(tempfile.data.gender) > 6 & tempfile.data.gender == 'non-binary'
gender = 0;
elseif length(tempfile.data.gender) > 4 & tempfile.data.gender == 'female'
gender = 1;
elseif length(tempfile.data.gender) == 4 & tempfile.data.gender == 'male'
gender = 2;
In a similar manner, if those are guaranteed to be the only possible responses, you can just use the size comparison to assign the value.
  1 个评论
Jen
Jen 2019-11-21
(reposted as comment)
Thanks for posting! Yes you're right in that there are two variables called gender at play here - one nested in my data structure, the other in my output as a coded version of the first.
I ran your suggestion but still get the error 'Matrix dimensions must agree' though I can see why this workaround should work, as in theory matlab would only be comparing strings of 6 or more for the first loop etc.
In the mean time though I have come up with the following solution which gives me the output I was hoping for:
STR = tempfile.data.gender;
for p = "n"
if startsWith(STR,p) == 1
gender = 0;
end
end
for p = "f"
if startsWith(STR,p) == 1
gender = 1;
end
end
for p = "m"
if startsWith(STR,p) == 1
gender = 2;
end
end
I'm looking to make this work with the most concise code possible so any thoughts on this are very welcome! I know there's always many ways to achieve the same goal in Matlab, and my code may be a little clunky though it gets the job done.

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Data Preprocessing 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by