apply string math to everything in a table

I have a table where one variable is a char array.
I am trying to scrub excess information out of that array,
e.g. ' "h t t p s : / / c a . s t y l e . y a h o o . c o m / 5 - a p p s " '
which I want to reduce to ' "c a . s t y l e . y a h o o . c o m " '
In general I know I can do that with erase to remove the "h t t p s : / / " and then use strfind to find the next '/' and make a new variable that contains the string up to that index.
But strfind and erase don't seem to work directly on the table. Nor can I figure out how to apply strfind to the whole table variable (as opposed to writing a for loop to step through it).
Is there some way to make these functions work with tables?

Accepted Answer

Walter Roberson 2020-5-13
scrubbed = regexprep(YourTable.VariableName, {'^([^/]*/){2}\s*', '\s*/.*$'}, {'',''}, 'once', 'lineanchors', 'dotexceptnewline');
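This code does not remove the spaces within the URL. That is also possible: add a third pattern, '\s+', and drop the 'once' option so that every run of whitespace is removed rather than only the first: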
scrubbed = regexprep(YourTable.VariableName, {'^([^/]*/){2}\s*', '\s*/.*$', '\s+'}, {'','',''}, 'lineanchors', 'dotexceptnewline');
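As a minimal sketch of how the result could be written back into the table: the names T, URL and Host below are hypothetical (they are not from the question), and the sketch assumes the table variable holds a cell array of character vectors without the interleaved spaces.

```matlab
% Hypothetical sample data; T, URL and Host are illustrative names only.
urls = {'"https://ca.style.yahoo.com/5-apps"'; '"http://www.example.com/a/b"'};
T = table(urls, 'VariableNames', {'URL'});

% First pattern: strip everything up to and including the second '/'
% (this also removes the leading double quote).
% Second pattern: strip from the next '/' through the end of the line.
pats = {'^([^/]*/){2}\s*', '\s*/.*$'};
T.Host = regexprep(T.URL, pats, {'', ''}, 'once', 'lineanchors', 'dotexceptnewline');

disp(T.Host)
```

regexprep processes every element of the cell array in one call, so no explicit loop over the rows is needed. Note that an entry with no path after the host name (for example '"https://example.com"') would not match the second pattern and would keep its trailing quote.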
  10 Comments
Walter Roberson 2020-5-15
target = 'uploadable.csv';
opts = detectImportOptions(target, 'encoding', 'utf16le');
t = readtable(target, opts);
Before R2020a you will get warnings about the encoding not being supported, and also a warning about a byte order mark.
warning('off', 'MATLAB:iofun:UnsupportedEncoding')
will get rid of the message about unsupported encoding.
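If you would rather not leave the warning switched off for the rest of the session, one option is to save the warning settings first and restore them afterwards. This is only a sketch of that idiom; the variable names id and origState are made up for illustration.

```matlab
id = 'MATLAB:iofun:UnsupportedEncoding';

origState = warning;      % save the current settings for all warnings
warning('off', id);       % silence only the unsupported-encoding warning

% ... call detectImportOptions / readtable here ...

warning(origState);       % restore the settings that were saved above
```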
Or you could use
target = 'uploadable.csv';
fmt = ['"%f" "%f" "%f" "%f" "%f" %q %q "%f" "%f" ',repmat('%q ',1,10), '"%f" "%f" %q "%f"'];
fid = fopen(target, 'rt', 'n', 'utf16-le'); %ignore warning about UTF16-LE not being supported
data = textscan(fid, fmt, 'delimiter', '\t', 'headerlines', 1);
fclose(fid)
This will give you a single warning about the encoding not being supported.
If the warning about encoding really bothers you, you can read the raw bytes and convert them yourself:
target = 'uploadable.csv';
fmt = ['"%f" "%f" "%f" "%f" "%f" %q %q "%f" "%f" ',repmat('%q ',1,10), '"%f" "%f" %q "%f"'];
fid = fopen(target, 'r');
bytes = fread(fid, [1 inf], '*uint8');
fclose(fid)
s = native2unicode(bytes, 'utf16le');
data = textscan(s, fmt, 'delimiter', '\t', 'headerlines', 1);
This works provided that the file does not occupy more than about 1/3 of your available memory.
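To illustrate what the native2unicode step does, here is a small self-contained sketch. The sample string is made up, and the idea that the "spaces" in the question are really the zero bytes of UTF-16 text read one byte at a time is an assumption, although it is consistent with the encoding warnings discussed above.

```matlab
% Hypothetical sample text, encoded the way a UTF-16LE file would store it.
sample = '"https://ca.style.yahoo.com/5-apps"';
bytes  = unicode2native(sample, 'UTF-16LE');   % uint8, two bytes per character here

% Treating each byte as its own character leaves a char(0) after every letter,
% which can display much like the spaced-out text in the question.
misread = char(bytes);

% Decoding with the right encoding recovers the original text.
decoded = native2unicode(bytes, 'UTF-16LE');
disp(decoded)
```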
Budding MATLAB Jockey
For reference everything worked great. You are my hero!
