Csv file: reading values with comma

45 次查看(过去 30 天)
Hi dear friends,
I need to work with csv file from this website: "https://ds.50hertz.com/api/WindPowerActual/DownloadFile?fileName=2020.csv"
Matlab is not able to read data with comma, and so the result is NaN.
I did the following code to import the data from the web, and to fix the semicolon separator:
%% Get the file from the web and save the file in the path
url = 'https://ds.50hertz.com/api/WindPowerActual/DownloadFile?fileName=2020.csv';
filename = 'WP_50hz.csv';
options = weboptions('Timeout',60);
outfilename = websave(filename,url,options);
% Function to fix the file
[Wind_data] = adjustment_file(filename);
%% Adjustment_file
function [T] = adjustment_file(FILE)
% SECTION 1: Clean up and rewrite the file
% Read in the entire file as text
file = FILE;
txt = fileread(file);
% Remove all null characters
txtClean = regexprep(txt,char(0),'');
% Re-write the file with a new name
% * If "file" contains the full path, you'll need to change
% this line below to rename the file.
fid = fopen(['clean_',file],'w');
fprintf(fid, '%s',txtClean);
fclose(fid);
newfile = ['clean_',file]; % Use the full path whenever possible!
opts = delimitedTextImportOptions("NumVariables", 4);
% Specify range and delimiter
opts.DataLines = [6, Inf];
opts.Delimiter = ";";
% Specify column names and types
opts.VariableNames = ["Datum", "von", "bis", "MW"];
opts.VariableTypes = ["datetime", "datetime", "datetime", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "Datum", "InputFormat", "dd.MM.yyyy");
opts = setvaropts(opts, "von", "InputFormat", "HH:mm");
opts = setvaropts(opts, "bis", "InputFormat", "HH:mm");
% Import the data
T = readtable(newfile, opts);
end
How can I handle the comma in the data to be able to read the values?
Thank you very much ; )

回答(1 个)

Jeremy Hughes
Jeremy Hughes 2020-9-18
编辑:Jeremy Hughes 2020-9-21
I think you need to set the decimal separator and thousands separator. Also, duration may be the better type for two of the variables.
Also, you can just supply UTF-16 I think instead of needing to do the file cleaning.
opts = delimitedTextImportOptions("NumVariables", 6);
% Specify range and delimiter
opts.DataLines = [6, Inf];
opts.Delimiter = ";";
opts.Encoding = "UTF-16";
% Specify column names and types
opts.VariableNames = ["Datum", "von", "bis", "MW", "OnshoreMW", "OffshoreMW"];
opts.VariableTypes = ["datetime", "duration", "duration", "double", "double", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "Datum", "InputFormat", "dd.MM.yyyy");
opts = setvaropts(opts, ["von", "bis"], "InputFormat", "hh:mm");
opts = setvaropts(opts, 4:6, "DecimalSeparator",",","ThousandsSeparator",".");
% Import the data
T = readtable(file, opts);
  4 个评论
Walter Roberson
Walter Roberson 2020-9-21
That is a warning about UTF-16, not an error message. In the release you were using, officially UTF-16 was not supported, but in practice it worked fine.
As of R2020a, MATLAB will automatically detect the UTF-16LE encoding.
Jeremy Hughes
Jeremy Hughes 2020-9-21
Also, the spaces in the variable names won't work in some earlier releases. Editing the steps to make sure those will work.

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Data Type Conversion 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by