Reading a CSV file and counting the lines that have a category of interest and are below a threshold

fName = 'SampleDataset03.csv';
myCat = 'A';
maxValue = 0.5;
fConn = fopen(fName, 'r');
firstLine = fgetl(fConn);
final = 0;
while ~feof(fConn)
    cLine = fgetl(fConn);
    parts = strsplit(cLine, ',');
    F = strcmp(myCat, parts(:,1));
    T = cellfun(@(x) x < maxValue, parts(:,2));
    if F
        if maxValue > parts(:,2)
            final = final + 1;
        end
    end
end
fclose(fConn);
I am writing a function that reads a CSV file and counts all of the lines that have an 'A' in the first column and a value below 0.5. I have the general framework of the code, but I am struggling to compare the values in the CSV file with the 0.5 threshold. I tried str2double to get a numeric value, but when I use it, the first column, which holds a letter, seems to cause an error. I also tried cellfun, but that does not seem to work either. Can you help?
Sample csv file:

Accepted Answer

dpb 2021-2-22
Edited: dpb 2021-2-22
>> data=readtable('SampleDataset03.csv');
>> sum(contains(data.Category,'A')&data.Value<0.5)
ans =
64
>>
For this you don't really need a separate function; just define an anonymous function in the calling routine and read the file inline as you go.
The anonymous function could be
fnCountEm=@(t,c,v) sum(contains(t.Category,c)&t.Value<v);
and be used like
>> fnCountEm(data,'A',0.5)
ans =
64
>> fnCountEm(data,'X',0.25)
ans =
0
>> fnCountEm(data,'G',1.25)
ans =
119
>>
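For readers who would rather keep the line-by-line loop from the question, here is a minimal sketch of one way it might be repaired. The underlying issue is that strsplit returns a cell array, so parts(:,2) is still a cell and cannot be compared with a number; converting only the second field with str2double avoids the error the letter column triggers. The file contents below are made up for illustration (they are not the poster's SampleDataset03.csv), and the sketch assumes each line holds exactly two comma-separated fields, category first and value second. The counter is named count here instead of final.

```matlab
% Build a small illustrative CSV (made-up data, not the poster's file).
fName = [tempname '.csv'];
fid = fopen(fName, 'w');
fprintf(fid, 'Category,Value\n');
fprintf(fid, 'A,0.10\nA,0.70\nB,0.20\nA,0.49\n');
fclose(fid);

myCat = 'A';
maxValue = 0.5;

fConn = fopen(fName, 'r');
fgetl(fConn);                          % skip the header line
count = 0;
while ~feof(fConn)
    cLine = fgetl(fConn);
    if ~ischar(cLine) || isempty(cLine)
        continue                       % skip blank lines or an end-of-file marker
    end
    parts = strsplit(cLine, ',');      % 1-by-N cell array of char vectors
    value = str2double(parts{2});      % convert only the numeric field
    % Curly braces extract the contents of a cell; strcmp is an exact match.
    if strcmp(parts{1}, myCat) && value < maxValue
        count = count + 1;
    end
end
fclose(fConn);
delete(fName);                         % remove the temporary file

disp(count)                            % 2 for the made-up rows above
```

Note that strcmp matches the category exactly, whereas contains, as used in the answer, also matches any category that merely includes the given text as a substring. Which one is appropriate depends on how the categories in the real file are named.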
2 Comments
dpb 2021-2-22
ADDENDUM:
Reading the file outside the function becomes more significant if this lookup is done more than once: with the read inside the function, the file has to be reread on every call; with the read outside, the data are already in hand and are read only once.
One could, of course, create the function with the data as persistent, but that adds the complexity of needing code to clear/recreate them if the input data are to be changed at any time.
All in all, in this case it just looks cleaner to me to read the data and do what is needed there...
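To make the persistent alternative concrete, here is a rough sketch of what such a function might look like. The name countBelow and the caching rule (reread only when the file name changes) are assumptions made for illustration and do not come from the thread; the function would need to be saved in its own file, countBelow.m, and it assumes the file has Category and Value columns as in the answer.

```matlab
function n = countBelow(fName, c, v)
% countBelow  Hypothetical helper: count rows whose Category contains c
% and whose Value is below v. The table is cached in a persistent
% variable and reread only when a different file name is passed in.
    persistent data cachedName
    if isempty(data) || ~strcmp(cachedName, fName)
        data = readtable(fName);       % first call, or a new file name
        cachedName = fName;
    end
    n = sum(contains(data.Category, c) & data.Value < v);
end
```

This illustrates the extra complexity mentioned above: if the file's contents change while its name stays the same, the cached table is stale, and the cache has to be discarded explicitly, for example with clear countBelow.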
dpb 2021-2-22
ADDENDUM SECOND:
fnCountEm=@(t,c,v) sum(contains(t.Category,c)&t.Value<v);
lets one redefine the table and call the function with new data at will; if the data are indeed unchanging, then one could encapsulate them inside the anonymous function and remove them from the argument list.
fnCountEm=@(c,v) sum(contains(data.Category,c)&data.Value<v); % data is the table in memory
Then, of course, if the data are ever changed, one must redefine the anonymous function to reflect that.
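The need to redefine the handle follows from how MATLAB anonymous functions work: a workspace variable used in the body is captured by value when the handle is created. A small sketch with two made-up tables (the rows are invented purely for illustration) shows the effect:

```matlab
% Made-up table standing in for the first version of the data.
data = table({'A'; 'A'; 'B'}, [0.1; 0.9; 0.2], ...
    'VariableNames', {'Category', 'Value'});
fnCountEm = @(c, v) sum(contains(data.Category, c) & data.Value < v);
n1 = fnCountEm('A', 0.5);     % 1: counts against the table captured above

% Replace the workspace variable; the handle still holds the old copy.
data = table({'A'; 'A'; 'A'}, [0.1; 0.2; 0.3], ...
    'VariableNames', {'Category', 'Value'});
n2 = fnCountEm('A', 0.5);     % still 1: the captured table is unchanged

% Redefine the handle so that it captures the new table.
fnCountEm = @(c, v) sum(contains(data.Category, c) & data.Value < v);
n3 = fnCountEm('A', 0.5);     % 3: all rows of the new table qualify

disp([n1 n2 n3])
```

The three-argument form, which takes the table as an input, avoids this pitfall at the cost of a longer call.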
