Extracting parts of a string

Question

Ana Maria Alzate 2018-6-15

Direct link to this question: https://ww2.mathworks.cn/matlabcentral/answers/405805-extracting-parts-of-a-string

Edited: Stephen23 2018-6-18

Accepted answer: Paolo

00000090Head.txt

I have a text file with information like this:

FileName; SampleFreq; Test;Modality;Channel;Description;StimIntensity; Position; RecordingTime
C:\Users\G10040419\Desktop\lp export application\Data 139\00000090_1.WAV; 22000; 2;1;1;5 CH Right;    0.00;   -10000; 40147.491374

I need to extract the SampleFreq (22000) and the Position (-10000). I tried to use regular expressions, but I cannot find a specific delimiter for these data.
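
For readers following along, here is a minimal sketch showing that the semicolon works as the delimiter for the sample line above. The variable names (header, line1, names, fields) are only illustrative and are not from the original post.

```matlab
% Header and first data line, copied from the question.
header = 'FileName; SampleFreq; Test;Modality;Channel;Description;StimIntensity; Position; RecordingTime';
line1  = 'C:\Users\G10040419\Desktop\lp export application\Data 139\00000090_1.WAV; 22000; 2;1;1;5 CH Right;    0.00;   -10000; 40147.491374';

% Split on the semicolon and trim the padding around each field.
names  = strtrim(strsplit(header, ';'));
fields = strtrim(strsplit(line1, ';'));

% Look the columns up by name rather than hard-coding their positions.
sampleFreq = str2double(fields{strcmp(names, 'SampleFreq')});
position   = str2double(fields{strcmp(names, 'Position')});
```

This handles a single line only; for the whole file, the textscan approach in the answers below is likely the better fit.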

Answer 1

Paolo 2018-6-15

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/405805-extracting-parts-of-a-string#answer_324841

Edited: Paolo 2018-6-15

The following code uses regexp to extract the data you want. You can play around with the expression here.

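 % Read the whole file into one character vector.
 data = fileread('00000090Head.txt');
 % Token 1: digits after "WAV;". Token 2: the signed integer in the Position column.
 % The fields in between are matched but not captured, so each match yields two tokens.
 expression = '(?<=WAV;\s*)(\d*)(?:;\s*\d*;\d;\d;.*?;\s*\d*\.\d*;\s*)(-?\d*)';
 [tokens,match] = regexp(data,expression,'tokens','match');
 sampleFrequency = cellfun(@(x) x(1,1),tokens);
 position = cellfun(@(x) x(1,2),tokens);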

Both position and sampleFrequency are 1x183 cell arrays and contain the data you are interested in.

position = {'-10000' '-9000' '-8000' '-7000' '-6000' '-5000' '-4500' '-4000' '-3500' '-3000' ................}

sampleFrequency = {'22000' '22000' '22000' '22000' '22000' '22000' '22000' '22000' '22000' '22000' .................}
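
Since both results are cell arrays of character vectors, a numeric conversion is probably needed before doing any arithmetic. A minimal sketch, using only the first three values shown above (the names positionNum and sampleFrequencyNum are illustrative):

```matlab
% First few values shown in the answer above, as cell arrays of char.
position        = {'-10000' '-9000' '-8000'};
sampleFrequency = {'22000' '22000' '22000'};

% str2double accepts a cell array of character vectors and
% returns a double array of the same size.
positionNum        = str2double(position);
sampleFrequencyNum = str2double(sampleFrequency);
```

Any entry that cannot be parsed as a number comes back as NaN, which makes bad matches easy to spot with isnan.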

Answer 2

per isakson 2018-6-15

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/405805-extracting-parts-of-a-string#answer_324798

Edited: per isakson 2018-6-15

Is this what you are looking for?

fid = fopen( '00000090Head.txt', 'r' );
cac = textscan( fid, '%*s%f%*f%*f%*f%*s%*f%f%*f', 'Headerlines',1,'Delimiter',';' );
fclose( fid );

Then inspect the result:

>> cac
cac =
  1×2 cell array
    {183×1 double}    {183×1 double}
>> cac{2}(1:3)
ans =
      -10000
       -9000
       -8000
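
To see how the format string lines up with the nine columns, here is a small sketch that runs textscan on the single data line from the question instead of on the file. Passing a character vector rather than a file identifier is a documented textscan usage; the variable names are illustrative.

```matlab
% First data line, copied from the question.
line1 = 'C:\Users\G10040419\Desktop\lp export application\Data 139\00000090_1.WAV; 22000; 2;1;1;5 CH Right;    0.00;   -10000; 40147.491374';

% One specifier per column; the asterisk means "read and discard".
fmt = ['%*s' ...   % FileName       (skipped)
       '%f'  ...   % SampleFreq     (kept)
       '%*f' ...   % Test           (skipped)
       '%*f' ...   % Modality       (skipped)
       '%*f' ...   % Channel        (skipped)
       '%*s' ...   % Description    (skipped)
       '%*f' ...   % StimIntensity  (skipped)
       '%f'  ...   % Position       (kept)
       '%*f'];     % RecordingTime  (skipped)

% textscan also accepts text directly, so no file is needed here.
C = textscan(line1, fmt, 'Delimiter', ';');
sampleFreq = C{1};
position   = C{2};
```

Only the two kept specifiers produce output cells, which is why the result has two columns.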

Answer 3

Ana Maria Alzate 2018-6-15

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/405805-extracting-parts-of-a-string#answer_324799

Yes, but it is not giving me the position; it is giving me the last column, the recording time.

Jan 2018-6-15

@Ana Maria Alzate: Please do not post comments in the section for answers in the future. There is a section for comments for this job. Thanks.

Ana Maria Alzate 2018-6-18

Thank you for the advice!

Answer 4

Stephen23 2018-6-18

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/405805-extracting-parts-of-a-string#answer_325067

Edited: Stephen23 2018-6-18

Importing the data as strings and then parsing them with regular expressions is inefficient and unnecessary here: the file is very nicely formatted in delimited columns, so the required data can easily and efficiently be read directly as numeric (or char). The command textscan makes it easy to specify how to read those columns, and the format string is much simpler and more intuitive than those regular expressions:

>> fmt = '%*s%f%*d%*d%*d%*s%*f%f%*f';
>> opt = {'HeaderLines',1,'Delimiter',';'};
>> [fid,msg] = fopen('00000090Head.txt','rt');
>> assert(fid>=3,msg)
>> C = textscan(fid,fmt,opt{:});
>> fclose(fid);
>> [C{:}]
ans =
-10000
 -9000
 -8000
 -7000
 -6000
 -5000
 -4500
 -4000
 -3500
 -3000
 -3000
 -2500
 -2000
 -1500
 -1000
  -500
     0
   500
  1000
  1500
   ... lots of lines here
 -3000
 -2500
 -2000
 -1500
 -1000
  -500
     0
   500
  1000
  1500
  2000
  2500
  3000
  3500
  4000
  4000
  5000