Help using the arrayfun() function to apply strsplit() to all entries in a string array

Question

I'm trying to wrap my head around how the arrayfun() function works and would greatly appreciate some help with a specific example:

I have a string array of weather data.

weather_strings = 
  10×1 string array
    "UTC,2140991,49.0"
    "UTC,2140992,49.1"
    "UTC,2140993,49.1"
    ...

I need to extract the values after the second comma (temperatures) as a 1x10 matrix of doubles, [49.0, 49.1, 49.1, ...].

I've figured out a clunky way to do this for a single entry (please let me know if there's a better way).

weather_string = weather_strings(1) % extract only the first entry
weather_string_split = strsplit(weather_string, ',') % apply strsplit() to split on commas
weather_string_split_trim = weather_string_split(:,3) % extract only 3rd column
weather_num_trim = str2num(weather_string_split_trim) % convert from string to double

But I can't seem to figure out how to use arrayfun() to apply that to every entry. I've tried:

weather_strings_split = arrayfun(strsplit(weather_strings,','), weather_strings) % apply stringsplit to split on commas, for all elements?

which gives the error message:

Error using strsplit (line 80)
First input must be either a character vector or a string scalar.
Error in test_window (line 17)
weather_strings_split = arrayfun(strsplit(weather_strings,','), weather_strings)

I'm probably missing something painfully obvious. What is it? I'm still somewhat of a beginner at coding, so I welcome you to explain it to me like I'm 5 years old.

Alternatively, if there's a clever way to extract these numbers directly from this data table (which came directly from a webread() function), I'd love to hear it. Var3 is a cell array.

weather_data_table =
  10×3 table
       Var1         Var2             Var3       
    __________    ________    __________________
    2018-11-26    17:41:25    'UTC,2140991,49.0'
    2018-11-26    17:42:27    'UTC,2140992,49.1'
    2018-11-26    17:43:28    'UTC,2140993,49.1'
...

Again, the goal is to get just the last numbers after the second comma of Var3 into a 1D matrix.

Thanks in advance!

Answer 1

Star Strider 2018-11-26

Try this:

% Loop over the rows; brace indexing returns each entry as a character vector
for k1 = 1:size(weather_strings,1)
    Col3(k1,:) = str2double(regexp(weather_strings{k1}, '\d*\.\d*', 'match'));
end

Col3 =
   49.0000
   49.1000
   49.1000

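The loop handles one string at a time. Strictly speaking, regexp does accept a string array or a cell array of character vectors, but it then returns a cell array of matches, so looping keeps the indexing simple.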
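
The arrayfun call in the question fails because its first argument, strsplit(weather_strings,','), is evaluated immediately on the whole array instead of being passed as a function handle. Below is a minimal sketch of one way to make the original idea work. It assumes R2017a or later for the double-quoted string literals, and the names split_cells and temps are illustrative, not from the thread.

```matlab
% Example data taken from the question (first three entries only)
weather_strings = ["UTC,2140991,49.0"; "UTC,2140992,49.1"; "UTC,2140993,49.1"];

% arrayfun needs a function handle as its first argument. The anonymous
% function below receives one string scalar s at a time.
% 'UniformOutput', false is required because each call returns three pieces,
% not a single scalar, so the results are collected in a cell array.
split_cells = arrayfun(@(s) strsplit(s, ','), weather_strings, 'UniformOutput', false);

% Take the third piece from each cell and convert it to a double.
temps = cellfun(@(parts) str2double(parts(3)), split_cells);

% Transpose to get the 1xN row vector requested in the question.
temps = temps.';
```

The two-step form mirrors the single-entry code in the question: split first, then pick the third piece and convert it.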

Colin Wilson 2018-11-27

Edited: Colin Wilson 2018-11-27

Awesome, thank you so much Star Strider and Stephen Cobeldick! This works brilliantly for my temperature data.

Is there a way to write a similar regexp function that would isolate the number from the end of the line, regardless of whether or not it contains a decimal point? (Which is why I originally tried to use commas as delimiters.)

I also need the same function to clean up my Humidity data, which has whole integer values.

For example:

weather_strings = 
  10×1 string array
    "UTC,2140991,59"
    "UTC,2140992,61"
    "UTC,2140993,60"
    ...

If the user selects Humidity data instead of Temperature data right now, I get the following error message:

Unable to perform assignment because the indices on the left side are not compatible with the size of the right
side.
Error in clean_data (line 14)
    clean_weather_strings(:,k) = regexp(weather_strings{k}, '\d*\.\d*', 'match');
Error in Lab7 (line 23)
clean_weather_doubles = clean_data(weather_data_table) % give input to clean_data function, save output

I assume this is because our '\d*\.\d*' expression looks for digits separated by a period. I'm just not familiar enough with the syntax of the regexp() function to know how to set it up differently.

Thanks again!

Stephen23 2018-11-28

Another easy solution: '\d+\.?\d*'
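
As a hedged sketch of how such a pattern could cover both data sets: on its own, '\d+\.?\d*' also matches the long integer in the second field, so the version below adds a trailing $ to anchor the match to the end of the line. The anchor and the 'once' option are additions for illustration, not part of the comment above, and the variable names are hypothetical.

```matlab
% Example data from the thread (first three entries of each set)
temperature_strings = ["UTC,2140991,49.0"; "UTC,2140992,49.1"; "UTC,2140993,49.1"];
humidity_strings    = ["UTC,2140991,59";   "UTC,2140992,61";   "UTC,2140993,60"];

% One or more digits, an optional decimal point, optional further digits,
% anchored to the end of the string by $.
pattern = '\d+\.?\d*$';

% cellstr() converts the string array to a cell array of character vectors,
% so regexp with 'match','once' returns one character vector per entry.
temperature = str2double(regexp(cellstr(temperature_strings), pattern, 'match', 'once'));
humidity    = str2double(regexp(cellstr(humidity_strings),    pattern, 'match', 'once'));
```

Because exactly one match is returned per line, the result has one row per input regardless of whether the value has a decimal point, which avoids the size-mismatch error shown above.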

Star Strider 2018-11-28

Noted. Thank you.

Answer 2

Andrei Bobrov 2018-11-28

Edited: Andrei Bobrov 2018-11-28

In R2016b:

>> weather_strings = string({'UTC,2140991,49.0'
                           'UTC,2140992,49.1'
                           'UTC,2140993,49.1'})
weather_strings = 
  3x1 string array
    "UTC,2140991,49.0"
    "UTC,2140992,49.1"
    "UTC,2140993,49.1"
>> str2double(regexp(weather_strings,'(\d+\.)?\d+$','match','once'))
ans =
           49
         49.1
         49.1
>> 
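
For the comma-delimiter route the question originally aimed at, a possible alternative is the split function, which works on a whole string array at once. This is only a sketch under stated assumptions: Var3 here is a hypothetical stand-in for weather_data_table.Var3, there is more than one row, and every row contains exactly two commas (split errors out if the rows yield different numbers of pieces).

```matlab
% Stand-in for weather_data_table.Var3 (a cell array of character vectors)
Var3 = {'UTC,2140991,49.0'; 'UTC,2140992,49.1'; 'UTC,2140993,49.1'};

% Convert to a string array and split every row on commas.
% For an Nx1 input with two commas per row, the result is an Nx3 string array.
parts = split(string(Var3), ',');

% Column 3 holds the values after the second comma; convert and transpose
% to obtain a 1xN row vector of doubles.
values = str2double(parts(:, 3)).';
```

The same lines should work unchanged for integer-valued data such as the humidity strings, since no assumption is made about a decimal point.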
