Csv file: reading values with comma
45 次查看(过去 30 天)
显示 更早的评论
Hi dear friends,
I need to work with csv file from this website: "https://ds.50hertz.com/api/WindPowerActual/DownloadFile?fileName=2020.csv"
Matlab is not able to read data with comma, and so the result is NaN.
I did the following code to import the data from the web, and to fix the semicolon separator:
%% Get the file from the web and save the file in the path
url = 'https://ds.50hertz.com/api/WindPowerActual/DownloadFile?fileName=2020.csv';
filename = 'WP_50hz.csv';
options = weboptions('Timeout',60);
outfilename = websave(filename,url,options);
% Function to fix the file
[Wind_data] = adjustment_file(filename);
%% Adjustment_file
function [T] = adjustment_file(FILE)
% SECTION 1: Clean up and rewrite the file
% Read in the entire file as text
file = FILE;
txt = fileread(file);
% Remove all null characters
txtClean = regexprep(txt,char(0),'');
% Re-write the file with a new name
% * If "file" contains the full path, you'll need to change
% this line below to rename the file.
fid = fopen(['clean_',file],'w');
fprintf(fid, '%s',txtClean);
fclose(fid);
newfile = ['clean_',file]; % Use the full path whenever possible!
opts = delimitedTextImportOptions("NumVariables", 4);
% Specify range and delimiter
opts.DataLines = [6, Inf];
opts.Delimiter = ";";
% Specify column names and types
opts.VariableNames = ["Datum", "von", "bis", "MW"];
opts.VariableTypes = ["datetime", "datetime", "datetime", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "Datum", "InputFormat", "dd.MM.yyyy");
opts = setvaropts(opts, "von", "InputFormat", "HH:mm");
opts = setvaropts(opts, "bis", "InputFormat", "HH:mm");
% Import the data
T = readtable(newfile, opts);
end
How can I handle the comma in the data to be able to read the values?
Thank you very much ; )
0 个评论
回答(1 个)
Jeremy Hughes
2020-9-18
编辑:Jeremy Hughes
2020-9-21
I think you need to set the decimal separator and thousands separator. Also, duration may be the better type for two of the variables.
Also, you can just supply UTF-16 I think instead of needing to do the file cleaning.
opts = delimitedTextImportOptions("NumVariables", 6);
% Specify range and delimiter
opts.DataLines = [6, Inf];
opts.Delimiter = ";";
opts.Encoding = "UTF-16";
% Specify column names and types
opts.VariableNames = ["Datum", "von", "bis", "MW", "OnshoreMW", "OffshoreMW"];
opts.VariableTypes = ["datetime", "duration", "duration", "double", "double", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "Datum", "InputFormat", "dd.MM.yyyy");
opts = setvaropts(opts, ["von", "bis"], "InputFormat", "hh:mm");
opts = setvaropts(opts, 4:6, "DecimalSeparator",",","ThousandsSeparator",".");
% Import the data
T = readtable(file, opts);
4 个评论
Walter Roberson
2020-9-21
That is a warning about UTF-16, not an error message. In the release you were using, officially UTF-16 was not supported, but in practice it worked fine.
As of R2020a, MATLAB will automatically detect the UTF-16LE encoding.
Jeremy Hughes
2020-9-21
Also, the spaces in the variable names won't work in some earlier releases. Editing the steps to make sure those will work.
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Data Type Conversion 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!