extract data from EEG text file

Question

D. Ali 2019-4-27


Edited: Cedric 2019-5-2

I need help writing a script that extracts the MCAP samples, together with the times at which they occurred, into a separate file and plots them, so that I can use these samples in a signal processing application in MATLAB. This is only part of the data; the file contains tens of CAP samples, so I need general code to extract them.

Time Date Sample # Type Sub Chan Num Aux

[22:16:05.000 01/01/2007] 0 " 0 0 0 ## time resolution: 256

[22:16:05.000 01/01/2007] 0 0 0 0

[22:34:35.000 01/01/2007] 284160 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:35:05.000 01/01/2007] 291840 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:35:35.000 01/01/2007] 299520 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:36:05.000 01/01/2007] 307200 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:36:35.000 01/01/2007] 314880 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:37:05.000 01/01/2007] 322560 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:37:35.000 01/01/2007] 330240 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:38:05.000 01/01/2007] 337920 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:38:35.000 01/01/2007] 345600 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:39:05.000 01/01/2007] 353280 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:39:35.000 01/01/2007] 360960 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:40:05.000 01/01/2007] 368640 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:40:35.000 01/01/2007] 376320 " 0 0 0 SLEEP-S0 30 W ROC-LOC

[22:41:05.000 01/01/2007] 384000 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:41:35.000 01/01/2007] 391680 " 0 0 0 SLEEP-S1 30 S1 ROC-LOC

[22:41:37.000 01/01/2007] 392192 " 0 0 0 MCAP-A3 17 S1 EEG-F4-C4

[22:41:57.000 01/01/2007] 397312 " 0 0 0 MCAP-A3 9 S1 EEG-F4-C4

[22:42:05.000 01/01/2007] 399360 " 0 0 0 SLEEP-S2 30 S2 ROC-LOC

[22:42:13.000 01/01/2007] 401408 " 0 0 0 MCAP-A3 11 S2 EEG-F4-C4

[22:42:28.000 01/01/2007] 405248 " 0 0 0 MCAP-A3 23 S2 EEG-F4-C4

[22:42:35.000 01/01/2007] 407040 " 0 0 0 SLEEP-S2 30 S2 ROC-LOC

[22:42:57.000 01/01/2007] 412672 " 0 0 0 MCAP-A3 10 S2 EEG-F4-C4

[22:43:05.000 01/01/2007] 414720 " 0 0 0 SLEEP-S2 30 S2 ROC-LOC

[22:43:11.000 01/01/2007] 416256 " 0 0 0 MCAP-A2 6 S2 EEG-F4-C

1 comment

per isakson 2019-4-28

See readtable (create table from file) and fixedWidthImportOptions (import options object for fixed-width text files, introduced in R2017a).


Answer 1

Cedric 2019-4-27


Edited: Cedric 2019-4-28


data01.txt

Using the data text file that you provided elsewhere (renamed and attached to this answer), here is a short example of one way to parse it. Note that it is not the best way, but it is good enough for starting the discussion:

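% Read the whole annotation file into one character vector.
buffer = fileread( 'data01.txt' ) ;
% One token per column; only rows whose event starts with "MCAP-" match.
pattern = '\[([^\]]+)\]\s+(\d+)\s+"\s+(\d+)\s+(\d+)\s+(\d+)\s+MCAP-(\S+)\s+(\S+)\s+(\S+)\s+(\S+)' ;
data = regexp( buffer, pattern, 'tokens' ) ;
% Stack the per-match token cells into an N-by-9 cell array.
data = vertcat( data{:} ) ;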

EDIT 04/28/2019@1:59pm UTC: I updated the pattern so REGEXP extracts all other "numeric" columns.

Running it outputs a cell array of 830 rows associated with MCAP entries, as follows:

>> data
data =
  830×9 cell array
    {'22:41:37.000 01…'}    {'392192' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'17'}    {'S1'}    {'EEG-F4-C4' }
    {'22:41:57.000 01…'}    {'397312' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'9' }    {'S1'}    {'EEG-F4-C4' }
    {'22:42:13.000 01…'}    {'401408' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'11'}    {'S2'}    {'EEG-F4-C4' }
    {'22:42:28.000 01…'}    {'405248' }    {'0'}    {'0'}    {'0'}    {'A3'}    {'23'}    {'S2'}    {'EEG-F4-C4' }
    ...
    {'07:08:22.000 02…'}    {'8175872'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'8' }    {'S4'}    {'EEG-F4-C4' }
    {'07:11:27.000 02…'}    {'8223232'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'8' }    {'S4'}    {'EEG-Fp2-F4'}
    {'07:12:08.000 02…'}    {'8233728'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'6' }    {'S4'}    {'EEG-Fp2-F4'}
    {'07:18:31.000 02…'}    {'8331776'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'6' }    {'S4'}    {'EEG-F4-C4' }
    {'07:18:53.000 02…'}    {'8337408'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'7' }    {'S4'}    {'EEG-F4-C4' }
    {'07:19:27.000 02…'}    {'8346112'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'15'}    {'S4'}    {'EEG-F4-C4' }
    {'07:20:29.000 02…'}    {'8361984'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'11'}    {'S4'}    {'EEG-F4-C4' }
    {'07:20:48.000 02…'}    {'8366848'}    {'0'}    {'0'}    {'0'}    {'A1'}    {'12'}    {'S4'}    {'EEG-F4-C4' }

Now, depending on what you want to accomplish, you may prefer to use a TIMETABLE or a TIMESERIES object, or simply convert some of these columns.

So now you should define which part of the data you are interested in and how you plan to process it.
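As one possible starting point, here is a minimal sketch that turns the MCAP tokens into a table with typed columns and writes it to a separate file. It runs on three lines copied from the question instead of the real file. Several things in it are my assumptions and not confirmed in this thread: the date format is taken to be day/month/year, the number after the MCAP label is treated as a duration in seconds, and the variable names and the output file name mcap_events.csv are made up for illustration.

```matlab
% Three annotation lines copied from the question, used in place of fileread().
buffer = sprintf( '%s\n', ...
    '[22:41:37.000 01/01/2007] 392192 " 0 0 0 MCAP-A3 17 S1 EEG-F4-C4', ...
    '[22:42:05.000 01/01/2007] 399360 " 0 0 0 SLEEP-S2 30 S2 ROC-LOC', ...
    '[22:42:13.000 01/01/2007] 401408 " 0 0 0 MCAP-A3 11 S2 EEG-F4-C4' ) ;

% Keep only the columns of interest: time stamp, sample #, and the four event fields.
pattern = '\[([^\]]+)\]\s+(\d+)\s+"\s+\d+\s+\d+\s+\d+\s+MCAP-(\S+)\s+(\d+)\s+(\S+)\s+(\S+)' ;
tokens  = regexp( buffer, pattern, 'tokens' ) ;
tokens  = vertcat( tokens{:} ) ;              % N-by-6 cell array of char vectors

% ASSUMPTION: dates are day/month/year.
onset = datetime( tokens(:,1), 'InputFormat', 'HH:mm:ss.SSS dd/MM/yyyy' ) ;

% ASSUMPTION: the number after the MCAP label is a duration in seconds.
% Variable names below are hypothetical, chosen for readability.
cap = table( onset, str2double( tokens(:,2) ), categorical( tokens(:,3) ), ...
             str2double( tokens(:,4) ), tokens(:,5), tokens(:,6), ...
             'VariableNames', { 'Onset', 'Sample', 'Phase', 'Duration', 'Stage', 'Channel' } ) ;

% Optional: a timetable indexed by onset time.
capTT = table2timetable( cap, 'RowTimes', 'Onset' ) ;

% Write the MCAP rows to a separate file (hypothetical file name).
outFile = fullfile( tempdir, 'mcap_events.csv' ) ;
writetable( cap, outFile ) ;
```

With the real file, replacing the sprintf call with fileread on the annotation file should be the only change needed, provided every MCAP line follows the layout shown in the question.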

Let me know if you have any questions.

21 comments

D. Ali 2019-4-28

It is a function in the PhysioNet WFDB toolbox. Yes, the function reads all the data, and it was easy to use it to convert the signals to physical units and display them in the signal app. It might be a good idea to edit this code so that it extracts only the MCAP samples with their times; I thought I could extract them from the samples text file. The code with the rdmat function needs only the three data files provided on PhysioNet.

rdmat

function varargout=rdmat(varargin)

[tm,signal,Fs,siginfo]=rdmat(recordName)

Import a signal in physical units from a *.mat file generated by WFDB2MAT.

Required Parameters:

recorName
    String specifying the name of the *.mat file.

Outputs are:

tm
    A Nx1 array of doubles specifying the time in seconds.
signal
    A NxM matrix of doubles contain the signals in physical units.
Fs
    A 1x1 integer specifying the sampling frequency in Hz for the entire record.
siginfo
    A LxN cell array specifying the signal siginfo. Currently it is a structure with the following fields:
        siginfo.Units
        siginfo.Baseline
        siginfo.Gain
        siginfo.Description

NOTE: You can use the WFDB2MAT command in order to convert the record data into a *.mat file, which can then be loaded into MATLAB/Octave's workspace using the LOAD command. This sequence of procedures is quicker (by several orders of magnitude) than calling RDSAMP. The LOAD command will load the signal data in raw units, use RDMAT to load the signal in physical units.

KNOWN LIMITATIONS: This function currently does support several of the features described in the WFDB record format (such as multiresolution signals) : http://www.physionet.org/physiotools/wag/header-5.htm
If you are not sure that the record (or database format) you are reading is supported, you can do an integrity check by comparing the output with RDSAMP:

    [tm,signal,Fs,siginfo]=rdmat('200m');
    [tm2,signal2]=rdsamp('200m');
    if(sum(abs(signal-signal2)) !=0);
        error('Record not compatible with RDMAT');
    end

Written by Ikaro Silva, 2014
Last Modified: November 26, 2014
Version 1.2
Since 0.9.7

%Example:
wfdb2mat('mitdb/200')
tic;[tm,signal,Fs,siginfo]=rdmat('200m');toc
tic;[signal2]=rdsamp('200m');toc
sum(abs(signal-signal2))

See also RDSAMP, WFDB2MAT

Cedric 2019-5-1

Edited: Cedric 2019-5-1


But the problem is not extracting the CAP entries; that part is technical, and we know how to do it.

Currently the problem, at least on my side, is that I still don't understand what you need to do with this. If I pick a series of lines associated with CAP from the source file:

[22:41:37.000 01/01/2007]   392192     "    0    0    0	MCAP-A3 17 S1 EEG-F4-C4
[22:41:57.000 01/01/2007]   397312     "    0    0    0	MCAP-A3 9 S1 EEG-F4-C4
[22:42:13.000 01/01/2007]   401408     "    0    0    0	MCAP-A3 11 S2 EEG-F4-C4
[22:42:28.000 01/01/2007]   405248     "    0    0    0	MCAP-A3 23 S2 EEG-F4-C4

OK, now what do I do with this? Are there data to extract from these lines that need to be converted to numeric for plotting? This file apparently does not contain signal information, so are these lines only defining time stamps for CAP?

If so, where do you need to add these labels? Is it on the plot of the signal(s) that you generate after calling RDMAT?

If so, is it something that is already done for all labels (how?) and you'd like to keep only CAP labels, or is it something that must be implemented from scratch?
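To make one possible reading of the question concrete: if the goal is to cut the piece of signal that belongs to each CAP event out of the recording, it could look roughly like the sketch below. Everything here is hypothetical and only illustrates the indexing: the signal is synthetic noise standing in for the rdmat output, the onset sample numbers and durations are invented, sample numbers are assumed to be 0-based, and the signal is assumed to be sampled at the 256 Hz given in the annotation header.

```matlab
Fs       = 256 ;                     % ASSUMPTION: signal sampled at the annotation resolution
signal   = randn( 20 * Fs, 1 ) ;     % synthetic 20 s, single-channel stand-in for rdmat output
startId  = [ 512 ; 2560 ] ;          % hypothetical 0-based onset sample numbers
duration = [ 3 ; 5 ] ;               % hypothetical event durations in seconds

segments = cell( numel( startId ), 1 ) ;
for k = 1 : numel( startId )
    first = startId(k) + 1 ;                                   % 0-based sample # to MATLAB row
    last  = min( first + duration(k) * Fs - 1, size( signal, 1 ) ) ;  % clip at end of recording
    segments{k} = signal( first : last, : ) ;                  % all channels of that event
end
```

Each cell of segments then holds the samples of one event, which can be plotted or passed to a signal processing function one at a time.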

D. Ali 2019-5-1

Perfect, thanks a lot for your time and patience.

Cedric 2019-5-2

Edited: Cedric 2019-5-2


No problem!

Next issue, though: rdmat outputs arrays that suggest there are 1e6 samples:

>> [tm,signal,Fs,siginfo]=rdmat('sdb4_edfm');
>> whos
  Name               Size                     Bytes  Class     Attributes
  Fs                 1x1                          8  double              
  siginfo            1x18                     11040  struct              
  signal       1000000x18                 144000000  double              
  tm                 1x1000000              8000000  double     

Here you see tm, which I suppose is the vector of times, with 1 million elements, and the array of signals with 1 million rows (each corresponding to a sample, I guess).

Now, after converting the sample # column of your annotation file to numeric:

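% Read the annotation file and keep only the MCAP rows, one token per column.
buffer = fileread( 'annotations sdb4.txt' ) ;
pattern = '\[([^\]]+)\]\s+(\d+)\s+"\s+(\d+)\s+(\d+)\s+(\d+)\s+MCAP-(\S+)\s+(\S+)\s+(\S+)\s+(\S+)' ;
annotations = regexp( buffer, pattern, 'tokens' ) ;
annotations = vertcat( annotations{:} ) ;
% Column 2 holds the sample #; convert it from text to double.
sampleId = str2double( annotations(:,2) ) ;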

I see sample numbers (IDs) up to 8,366,848, which is way above 1 million. So most of the sample IDs correspond to regions that are outside of the plot (?)
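The arithmetic behind this can be checked with a small sketch. It uses the "time resolution: 256" line from the annotation header and four sample numbers quoted in this thread. Dividing a sample number by 256 gives the seconds elapsed since the first time stamp (22:16:05), which reproduces the time stamps in the file. What I am assuming, and have not verified, is that the signal returned by rdmat is sampled at the same 256 Hz, that sample numbers are 0-based, and that dates are day/month/year.

```matlab
FsAnn    = 256 ;                                   % from "## time resolution: 256"
sampleId = [ 392192 ; 397312 ; 401408 ; 8366848 ] ;  % sample numbers quoted in this thread
nSamples = 1000000 ;                               % rows of 'signal' returned by rdmat

% Elapsed time since sample 0, and the corresponding clock time.
elapsed = seconds( sampleId / FsAnn ) ;
t0      = datetime( '22:16:05.000 01/01/2007', ...
                    'InputFormat', 'HH:mm:ss.SSS dd/MM/yyyy' ) ;  % ASSUMPTION: day/month/year
onset   = t0 + elapsed ;

% ASSUMPTION: 0-based sample numbers, so sample k is row k+1 of 'signal'.
inRange = ( sampleId + 1 ) <= nSamples ;

fprintf( 'Signal covers %.2f minutes\n', nSamples / FsAnn / 60 ) ;
fprintf( '%d of %d annotations fall inside it\n', nnz( inRange ), numel( inRange ) ) ;
```

Under these assumptions 1e6 samples cover only about 65 minutes of the night, so the first three annotations land inside the signal while the last one, more than nine hours in, does not.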
