Compare table variables with input list and replace missing vars with NA

2 次查看(过去 30 天)
Hi,
I have a set of data attached in output.csv file which I would like to compare with my masterlist in txt file and see if there's anything missing, if there is a variable missing I want to add it and put its points as NA. How do i do that?
Thanks in advance.

采纳的回答

NIVEDITA MAJEE
NIVEDITA MAJEE 2022-6-30
Hello,
You could do it like this:
[~,output] = xlsread('output.csv', 'A:A'); %reading the Names column from output.csv
masterlist = importdata('masterlist.txt'); %reading the data from masterlist.txt
diff = setdiff(masterlist, output); %finding the variables missing in output.csv after comparing it with the variables in masterlist.txt
NA_cell = cell(size(diff)); %creating a cell array with the same dimension as diff array
NA_cell(:) = {'NA'}; % populating it with 'NA'
diff = [diff NA_cell]; %concatenating NA into the diff array
writecell(diff, 'output.csv', 'WriteMode', 'append'); %appending the diff matrix into the output.csv
Hope this solves your issue!

更多回答(2 个)

Eric Sofen
Eric Sofen 2022-6-30
Readtable and outerjoin do the trick (xlsread is discouraged at this point - readtable is preferred). FYI, there's also a join Live Task that you can use in the Live Editor to explore different join options interactively.
t = readtable("https://www.mathworks.com/matlabcentral/answers/uploaded_files/1050265/output.csv",Delimiter=",");
m = readtable("https://www.mathworks.com/matlabcentral/answers/uploaded_files/1050270/masterlist.txt",Delimiter=",", ReadVariableNames=false);
m.Properties.VariableNames(1) = "Names";
outerjoin(t,m,Keys="Names",MergeKeys=true)
ans = 101×2 table
Names Points ________________ ______ {'abc_0_DivA' } 70 {'abc_100_DivD'} NaN {'abc_10_DivA' } 90 {'abc_11_DivA' } 20 {'abc_12_DivA' } 100 {'abc_13_DivA' } 60 {'abc_14_DivA' } 30 {'abc_15_DivA' } 70 {'abc_16_DivA' } 40 {'abc_17_DivA' } 50 {'abc_18_DivA' } 90 {'abc_19_DivA' } 60 {'abc_1_DivA' } 40 {'abc_20_DivA' } 60 {'abc_21_DivA' } 90 {'abc_22_DivA' } 60
  6 个评论
RD
RD 2022-6-30
2017b and now it gives an error on outerjoin
Error using tabular/outerjoin (line 127)
Invalid parameter name: 'Names'.
Error in Untitled (line 12)
outerjoin(t,m,"Names",true);
RD
RD 2022-7-6
the function outerjoin is still giving me an issue, I will try to look into the documentation more about the function to figure out if there are any user error's on my part. But for the time being @NIVEDITA MAJEE's solution with setdiff function is working for me.
Thanks for the input.

请先登录,再进行评论。


per isakson
per isakson 2022-7-2
This should work with 2017b.
%%
fid = fopen( 'output.csv', 'rt' );
output_csv = textscan( fid, '%s%*f', 'Headerlines',1, 'Delimiter',',' );
[~] = fclose( fid );
fid = fopen( 'masterlist.txt', 'rt' );
masterlist = textscan( fid, '%s' );
[~] = fclose( fid );
%%
lacking = setdiff( masterlist{1}, output_csv{1} );
%%
fid = fopen( 'output.csv', 'a' );
for cac = reshape( lacking, 1,[] )
[~] = fprintf( fid, '%s,%s\r\n', cac{1}, 'NA' );
end
[~] = fclose( fid );
%%
dbtype output.csv 96:102
96 abc_72_DivC,0 97 abc_73_DivC,0 98 abc_91_DivD,0 99 abc_99_DivD,-40 100 abc_100_DivD,NA 101 abc_44_DivB,NA 102 abc_74_DivC,NA

标签

产品


版本

R2017b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by